Search Engine Glossary - Search Engine Terminology You Should Know
Here is an ever-growing list of search engine terms you should
know. The search engine glossary is organized alphabetically.
If you would like to add a term to the list, please email us at info@salientmarketing.com.
A - Search Terms
Adjacency
A property of the relationship between words in a search engine
(or directory) query. Search engines often allow users to specify
that words should be next to one another or somewhere near one another
in the web pages searched.
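As a rough illustration of an adjacency test, the sketch below checks
whether one word follows another within a given number of intervening
words. The function name words_near and the max_gap parameter are
hypothetical, and real search engines may define proximity differently
(for example, ignoring word order).

```python
import re

def words_near(text, first, second, max_gap=0):
    """Return True if `second` follows `first` with at most
    `max_gap` other words in between (max_gap=0 means adjacent)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    first, second = first.lower(), second.lower()
    for i, word in enumerate(words):
        if word == first:
            # Look at the words immediately after this occurrence.
            window = words[i + 1 : i + 2 + max_gap]
            if second in window:
                return True
    return False
```

With max_gap=0 the words must be next to one another; a larger value
loosens this to "somewhere near one another".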
Agent Name Delivery
The process of sending search engine spiders to a tailored
page, yet directing your visitors to what you want them to see.
This is done using server side includes (or other dynamic content
techniques). SSI, for example, can be used to deliver different
content to the client depending on the value of HTTP_USER_AGENT.
Most normal browser software packages have a user agent string which
starts with "Mozilla" (coined from Mosaic and Godzilla).
Most search engine spiders have specific agent names, such as "Gulliver",
"Infoseek sidewinder", "Lycos spider" and "Scooter".
By switching on the value of HTTP_USER_AGENT (a process known
as agent detection), different pages can be presented at
the same URL, so that normal visitors will never see the page submitted
to search engines (and vice versa).
In practice this is somewhat simplistic. Some search engines pretend
to be "plain mozilla" browsers to prevent use of agent
name delivery. Effective use of agent name delivery can be very
difficult, and may not even work.
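The agent detection described above can be sketched as a simple switch
on the user agent string. This is a minimal sketch, not a recommended
implementation: the marker list is taken from the spider names quoted
in this entry, and the function name and page file names are
hypothetical. In a CGI program the value would normally come from the
HTTP_USER_AGENT environment variable.

```python
# Hypothetical markers, based on the spider names quoted in this entry.
SPIDER_MARKERS = ("gulliver", "sidewinder", "lycos", "scooter")

def choose_page(user_agent):
    """Pick which page to serve at a URL, based on the user agent."""
    agent = (user_agent or "").lower()
    if any(marker in agent for marker in SPIDER_MARKERS):
        return "spider.html"   # hypothetical page tailored for spiders
    return "visitor.html"      # hypothetical page for normal browsers
```

A spider that identifies itself as a plain "Mozilla" browser falls
through to the visitor page, which is exactly the weakness noted above.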
How do you spot agent name delivery at work? This is quite difficult,
as the owners of web pages using agent name delivery can control
what you see! You may be able to guess that a page is using this
technique if it appears to be indexed incorrectly or the title or
description don't match the page you see, but this could also have
been achieved by switching pages after the relevant search engine
has indexed it. If you really want to see the search engines' tailored
version of a page, write a program (e.g. a Perl script) to retrieve
the URL with HTTP_USER_AGENT set to each of the strings used by
the search engine spiders. If agent name delivery is in use, one
or more of the retrieved pages will be different to the others!
See also hidden text and IP delivery.
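The checking program suggested in this entry could look something like
the sketch below, written here in Python rather than Perl. It requests
the same URL once per user agent string and compares a hash of each
response. The agent list reuses the names quoted above, the function
names are hypothetical, and pages with rotating content (dates,
adverts) may differ between requests even without agent name delivery.

```python
import hashlib
import urllib.request

# One ordinary browser string plus the spider names quoted in this entry.
AGENTS = ["Mozilla/4.0", "Gulliver", "Infoseek sidewinder",
          "Lycos spider", "Scooter"]

def fetch(url, user_agent):
    """Retrieve a URL while sending the given User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read()

def compare_agents(url, agents=AGENTS, fetcher=fetch):
    """Map each user agent to a digest of the page it was served."""
    return {agent: hashlib.sha256(fetcher(url, agent)).hexdigest()
            for agent in agents}

def differs(digests):
    """True if at least one agent was served a different page."""
    return len(set(digests.values())) > 1
```

The fetcher argument exists only so the comparison can be tried
without network access, by passing in a stand-in function.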
Altavista
A popular search engine with the largest database on the web, indexing
more than 140 million pages. Its main URL is http://www.altavista.com.
Until 1998, this search engine provided the search facility for
Yahoo. Altavista indexes all the words in a web page, and
new pages are normally added to the database fairly quickly, within
a couple of working days. You are asked to submit just the main
page of your site. The Altavista spider will then explore your site
and index a representative sample of the pages. Some problems with
spamming have been noticed. The use of keyword meta tags is penalised.
Altavista places various alternative options before its search results,
including suggested questions (using the Ask Jeeves service) and
RealNames. Paid entries are beginning to appear at the start
of the search results.
AOL Netfind
The default search engine for users of the AOL internet service
provider, and hence a busy site. Its URL is http://www.netfind.com.
It is essentially the same engine as Excite.
Applet
A small program, often written in Java, which usually runs
in a web browser, as part of a web page. It is possible that the
use of such a program may cause spiders and robots to stop indexing
a page.
ArchitextSpider
The name of the Excite search engine's spider.
Ask Jeeves
A meta search engine which can be asked questions in English. This
service is also in use at Altavista. http://www.askjeeves.com.
B - Search Terms
Bait-and-Switch
The provision of one page for a search engine or directory and a
different page for other user agents at the same URL. Various methods
can be used, e.g. Agent Name Delivery or IP Delivery.
C - Search Terms
CGI
Common Gateway Interface - a standard interface between web server
software and other programs running on the same machine.
CGI Program
Strictly, any program which handles its input and output data according
to the CGI standard. In practice, CGI programs are used to handle
forms and database queries on web pages, and to produce non-static
web page content.
Channels, Channel listings
Lists of links to selected (and usually popular) web sites. The
links are maintained by search engines and directories and are sorted
into categories or channels. Sites are picked by a channel editor,
often because of a site's already high ranking with the search engines.
Some search engines and directories allow visitors to nominate sites
for inclusion in their channels.
Client
A computer, program or process which makes requests for information
from another computer, program or process. Web browsers are client
programs. Search engine spiders are (or can be said to behave as)
clients.
Click through
The process of clicking on a link in a search engine output page
to visit an indexed site.
This is an important link in the process of receiving visitors to
a site via search engines. Good ranking may be useless if visitors
do not click on the link which leads to the indexed site. The secret
here is to provide a good descriptive title and an accurate and
interesting description.
Cloaking
The hiding of page content. Normally carried out to stop page thieves
stealing optimized pages. See also Bait-and-Switch.
Clustering
The listing of only one page from each web site in a search engine
or directory's list of search results. This avoids occupation of
all the top results by a small number of web sites and makes the
list of results clearer and more useful to the user.
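One simple way to picture clustering is to keep only the first
(highest ranked) result from each host name. The sketch below assumes
the input list is already in ranking order and treats the host part of
the URL as "the web site"; the function name is hypothetical and real
engines may group sites differently.

```python
from urllib.parse import urlsplit

def cluster_results(ranked_urls):
    """Keep only the first result seen for each host name."""
    seen_hosts = set()
    clustered = []
    for url in ranked_urls:
        host = urlsplit(url).netloc.lower()
        if host not in seen_hosts:
            seen_hosts.add(host)
            clustered.append(url)
    return clustered
```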
Comment
The HTML <!-- and --> tags are used to hide text from browsers.
Some search engines ignore text between these symbols but others
index such text as if the comment tags were not there. Comments
are often used to hide JavaScript code from non-compliant browsers,
and sometimes (notably on Excite) to provide invisible keywords
to some search engines.
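The two behaviours described in this entry can be sketched as a single
function with a switch. This is only an illustration under the
assumption that comments are handled with simple text matching; the
function name and the index_comments flag are hypothetical and do not
describe how any particular search engine works.

```python
import re

# Matches an HTML comment, including comments that span several lines.
COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)

def indexable_text(html, index_comments=False):
    """Return the text an indexer would see under each behaviour."""
    if index_comments:
        # Index comment text as if the comment delimiters were not there.
        return html.replace("<!--", " ").replace("-->", " ")
    # Ignore everything between the comment delimiters.
    return COMMENT.sub(" ", html)
```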
Crawler
See Spider.
D - Search Terms
Dead Link
An internet link which doesn't lead to a page or site, probably
because the server is down or the page has moved or no longer exists.
Most search engines have techniques for removing such pages from
their listings automatically, but as the internet continues to increase
in size, it becomes more and more difficult for a search engine
to check all the pages in the index regularly. Reporting of dead
links helps to keep the indexes clean and accurate, and this can
usually be done by submitting the dead link to the search engine.
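A dead link check can be sketched as follows. The assumption here is
that a link counts as dead if the request fails outright (for example
because the server is down) or the server answers with an error
status; the function name is hypothetical, and a real checker would
probably retry before declaring a link dead.

```python
import urllib.error
import urllib.request

def is_dead(url, opener=urllib.request.urlopen):
    """Return True if the URL cannot be retrieved successfully."""
    try:
        with opener(url, timeout=10) as response:
            return response.status >= 400
    except (urllib.error.URLError, OSError):
        # Covers HTTP error responses, unreachable servers and timeouts.
        return True
```

The opener argument is there only so the function can be exercised
without network access.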
De-listing
The removal of pages from a search engine's index.
Removal can occur for various reasons, including unreliability
of the machine that hosts a site or because of perceived attempts
at spamdexing.
Description
Descriptive text associated with a web page and displayed, usually
with the page title and URL, when the page appears in a list of
pages generated by a search engine or directory as a result of a
query. Some search engines take this description from the DESCRIPTION
Meta tag - others generate their own from the text in the page.
Directories often use text provided at registration.
Direct Hit
A system which monitors users' selections from search engine
results, counting which results are clicked on most and how long
visitors spend at each site, so as to improve relevancy.
Used by HotBot and as a plug-in to Apple's new, innovative Sherlock
search system. See www.directhit.com.
Directory
A server or a collection of servers dedicated to indexing internet
web pages and returning lists of pages which match particular queries.
Directories (also known as Indexes) are normally compiled manually,
by user submission (such as at whatsnew.com), and often involve
an editorial selection and/or categorization process (such as at
LookSmart and Yahoo).
Dogpile
A meta search engine. Found at
http://www.dogpile.com.
Domain
A sub-set of internet addresses. Domains are hierarchical, and lower-level
domains often refer to particular web sites within a top-level domain.
The most significant part of the address comes at the end - typical
top-level domains are .com, .edu, .gov, .org (which sub-divide addresses
into areas of use). There are also various geographic top-level
domains (e.g. .ar, .ca, .fr, .ro etc.) referring to particular countries.
The relevance to search engine terminology is that web sites which
have their own domain name (e.g. http://www.yourname.com) will often
achieve better positioning than web sites which exist as a sub-directory
of another organization's domain (e.g. http://www.example.com/yourname).
Doorway Page
See Gateway Page.
Dynamic content
Information on web pages which changes or is changed automatically,
e.g. based on database content or user information. Sometimes it's
possible to spot that this technique is being used, e.g. if the
URL ends with .asp, .cfm, .cgi or .shtml. It is possible to serve
dynamic content using standard (normally static) .htm or .html type
pages, though. Search engines will currently index dynamic content
in a similar fashion to static content, although they will not usually
index URLs which contain the ? character.
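The URL clues mentioned in this entry can be checked mechanically. The
sketch below is only a heuristic: the extension list comes from this
entry, the function names are hypothetical, and (as noted above) a
page ending in .htm or .html may still be served dynamically.

```python
from urllib.parse import urlsplit

# Extensions named in this entry as hints of dynamic content.
DYNAMIC_EXTENSIONS = (".asp", ".cfm", ".cgi", ".shtml")

def looks_dynamic(url):
    """Guess from the URL alone whether a page is generated dynamically."""
    parts = urlsplit(url)
    return bool(parts.query) or parts.path.lower().endswith(DYNAMIC_EXTENSIONS)

def likely_skipped_by_spiders(url):
    """URLs containing the ? character are often not indexed."""
    return "?" in url
```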