Planning And Conducting An Internet Search

Your search for a specific item in a world of information can be difficult,
especially if the search is done randomly and without any planning. This
section offers suggestions to the beginner on conducting a search in an
orderly and informed way.

The following suggestions are for those just starting to learn to search
the Web.

o Try to develop a general understanding of the search tools, process
and language. However, it is not necessary to know everything at the
beginning. Start with the information that you need to search subjects
of interest. You will find that your understanding will build as your
experience broadens.


o In the beginning, avoid searches for obscure information not likely to
be found unless you use sophisticated search methods. Once you master
simpler searches, you can move toward the more complex ones.


o Prefer to work first with those search tools that give the simplest
and clearest instructions. Present examples are Excite, Infoseek, and
Cyber411. The help section of the search tool's Home Page will usually
provide details that assist with the search. From these learn how best
to compose the query and focus the search.


o In time, learn to use the more advanced search operators, as these
will provide fewer and more relevant hits.


o If you know exactly what you are looking for, use a keyword search via
a Search Engine or a Multi-Engine search tool. If you are looking for
general information on a subject, conduct the search by using a
Directory search tool such as Yahoo.

Searching By Keyword


There are various levels of complexity in conducting a keyword search.
Begin with the simpler searches and work your way toward those that are
more complex.

1. Simple Searches

For search queries that don't require operators, such as single terms or
proper names, use a keyword search engine such as AltaVista or HotBot.
These search engines rate particularly high for completeness and currency.

For queries using a phrase, use quotes to enclose the phrase. This will
greatly reduce the number of hits and improve relevancy. Also, be sure to
capitalize proper names.

2. Moderately Complex Searches

A convenient method of conducting a moderately complex search is to use
search engines that have a common set of operators as found in Table 2.
These "Common Operators" briefly recapped are:

o Use quote marks around search words that belong together.
o Use [+] sign without a following space to require that returned
references contain the term.
o Use [-] sign without a following space to require that returned
references omit the term.
o Use lower case except with proper names. These are capitalized and
separated by commas.
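
As a minimal sketch of how the Common Operators combine into a single
query, the following Python example builds a query string from its parts.
The function name compose_query and the sample terms are illustrative
assumptions, not part of any search tool.

```python
def compose_query(phrases=(), required=(), excluded=(), optional=()):
    """Build a query string using the Common Operators (illustrative helper)."""
    parts = []
    # Quote marks keep search words that belong together as one phrase.
    parts += ['"%s"' % phrase for phrase in phrases]
    # A plus sign, with no following space, requires the term.
    parts += ["+" + term for term in required]
    # A minus sign, with no following space, omits the term.
    parts += ["-" + term for term in excluded]
    # Remaining terms are passed through unchanged.
    parts += list(optional)
    return " ".join(parts)


query = compose_query(
    phrases=["search engine"],
    required=["tutorial"],
    excluded=["advertising"],
)
print(query)  # "search engine" +tutorial -advertising
```

The resulting string can be typed or pasted into the search box of any tool
that accepts these operators.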

For a quick and efficient search, begin with a Multi-Engine search
tool such as Cyber411, using the above "Common Operators" to compose your
query. SavvySearch works well for this type of search, because it utilizes
several search tools simultaneously to provide a relatively short hit list
of high relevancy.

If Cyber411 does not provide adequate results, use one or more of the
preferred search engines singly. This approach provides many more hits and
therefore more opportunity to capture the information you want. Suitable
examples include Infoseek and HotBot.

One good way to use many search tools efficiently is to utilize the
All-In-One search tool. It works this way:

o Go to the All-In-One search tool and choose the WWW grouping. Scroll
to the first search tool which appears promising.
o Compose the query using the "Common Operators" described above, and
type it into the search box.
o Copy the query onto clipboard for later use.
o Click search and evaluate the hits.
o Go to the next promising search engine, and paste the query into the
query box. And so on.

Once set up, the procedure works rapidly. The slow part is evaluating the
hits. Be prepared for some hits to show up in the results of several search
tools. However, some hits will be unique to a search tool, among which may
be the reference of most interest to you.
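
When the same query is run through several tools, it can help to note
which hits are shared and which are unique. The following Python sketch
does this for hit lists you have collected yourself. The tool names and
addresses are invented for illustration, and the helper name
group_hits_by_tool is an assumption.

```python
from collections import defaultdict


def group_hits_by_tool(results):
    """Map each hit address to the set of search tools that returned it."""
    found_by = defaultdict(set)
    for tool, hits in results.items():
        for url in hits:
            found_by[url].add(tool)
    return found_by


# Invented hit lists, for illustration only.
results = {
    "ToolA": ["example.com/a", "example.com/b"],
    "ToolB": ["example.com/b", "example.com/c"],
}

found_by = group_hits_by_tool(results)
shared = sorted(url for url, tools in found_by.items() if len(tools) > 1)
unique = sorted(url for url, tools in found_by.items() if len(tools) == 1)

print("Returned by several tools:", shared)
print("Unique to one tool:", unique)
```

Hits that appear only once are worth a second look, since they may include
the reference that no other tool found.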

The greater simplicity provided by the use of keyword searches has a
trade-off. It will normally produce a very large number of hits of
generally lower relevance. Because the hits will be ranked according to
relevance, the first 30 hits or so are most likely to contain the most
relevant references.

3. Highly Complex Searches

These searches are for obscure information or difficult-to-define queries,
and they benefit from the use of a more sophisticated search engine. For
these more difficult searches, try the advanced mode of the selected search
engine and adhere to its instructions. This requires study of the help
section of the search tool and diligent use of its operators. Used in this
way, AltaVista is a powerful and effective search tool.

Directory And Directory/Keyword Searches

By comparison to Keyword searches, the procedure for a Directory search is
rather simple. Such searches are for browsing, where the paths they take
are from general subjects to increasingly more specific topics. Follow the
search path to the desired topic and then examine the hits that are
provided at each stop. The hits will normally contain links that will
further your search.

Directories depend on people to update their databases, and therefore the
relevancy of the information they provide is high. However, it is achieved
at the expense of completeness and currency of the information in the
database. Conversely, search engines collect and update web sites
automatically, and therefore are more current and complete, but at the
expense of a much larger number of hits of generally lower relevance.
Automatic updating of search engine databases occurs routinely, usually
within days. Directory references take considerably longer, normally weeks
and sometimes as long as months.

Today, some Directory search tools provide an option for switching to a
keyword search at each Directory stop along the way. This allows you to
narrow the search field to simplify your search. When choosing the keyword
option, compose the query in the search box provided and follow keyword
instructions. Excite and Yahoo are effective search tools having this
allied subject/keyword capability.

Evaluating Hits

This is usually the hardest and most time-consuming part of a search. The
number of hits you obtain can range from none to hundreds of thousands, and
their relevance or usefulness can vary from considerable to negligible.
There are some things you can do to help produce more relevant hits for the
fewest total number.

o Too many hits are caused by the use of queries that are too general.
Try using more specific terms. The more exact your query, the better
your results.
o Too few hits are usually caused by too restrictive a query. Broaden
your search by removing the least required keywords or operators.
o Try starting with a subject search and continue down the path to the
last relevant title. At this stop, switch to a keyword search. This
limits the search to the last subject title, which will reduce the
hits and improve their relevancy.
o Compose the query with the appropriate operators for the particular
search tool that you select. A large number of irrelevant hits is
often due to a powerful search engine misguided in its search.
o Narrow the scope of your search by choosing a specific field of search
offered by the search engine, such as a time period or geographical
area.
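
The effect of narrowing and broadening a query can be sketched in a few
lines of Python. This is a toy model that assumes a tiny invented document
list and simple whole-word matching; real search tools match in their own
ways. The function name find_hits is hypothetical.

```python
def find_hits(docs, terms, require_all=True):
    """Return documents containing all of the terms, or any of them."""
    combine = all if require_all else any
    return [
        doc for doc in docs
        if combine(term in doc.lower().split() for term in terms)
    ]


# Invented documents, for illustration only.
docs = ["growing roses in shade", "growing tomatoes", "roses and thorns"]

narrow = find_hits(docs, ["growing", "roses"])                    # all terms
broad = find_hits(docs, ["growing", "roses"], require_all=False)  # any term

print(len(narrow), "hit(s) when every term is required")
print(len(broad), "hit(s) when any term is enough")
```

Requiring every term returns one document here, while accepting any term
returns all three, which is the same trade-off described in the list above.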

Success in any particular search query is usually more a question of which
search tool has the best database for the subject and how the information
is organized for retrieval. This is why it is often necessary to try a
number of different search tools when searching for obscure information.

Some search engines list the hits by titles, some by brief text and some
give you a choice. When available, choose the brief text, as it is easier to
evaluate. Even so, it is often necessary to click the link to see the
entire document before you can assess its content. Some sites may not be of
apparent interest, but will contain links that have great relevancy. Some
searches yield the desired information quickly, and some you may just have
to plod through.

As you gain experience, you will find which search tools are most
appropriate for your particular interests and how best to evaluate the
hits.

Summation

Learning to search the Web is an incremental process that builds with
experience. You will find that your search skills will increase as you gain
greater understanding of search terminology, search tools and their
intricacies and the way information is stored and retrieved. The learning
process is arduous; the reward is a world of information that is made
available to you.

SEARCH TOOLS REFERENCES

This section provides a convenient way to access help and background
information on each of the search tools listed in Table 2. Because of rapid
changes in the search field, you will want to keep abreast of the changes
for the tools that you use most. The following explains terms used in
this section and provides some helpful hints.

Address: The Web address or URL. You can access an address by clicking
it.

Automatic Document Scanning: This is the means of identifying, indexing and
cataloguing Web sites. It employs robots or spiders for scanning virtually
all web sites to augment and update the databases of search engines.

Bookmark: To access Home and Help Pages conveniently, create an address
folder for each under Bookmarks. This is done by going to the Home or Help
Pages via the links provided in this guide and adding them to the
appropriate bookmark folder.

Common Operators: We use this term to describe a set of most-used operators
of the popular search engines. Common Operators are generally compatible
with Multi-Engine search tools as well. [See Searching By Keyword in
Section E for a description of their use]

Default: The operating mode when no other is specified.

Frame-based Information: That which resides in a box within a Web page.
Some search engines will not search within frames and therefore the
information there is neither indexed nor retrievable.

Full Text: Indicates every word in the text is scanned. The information
recorded is therefore potentially accessible via keyword use.

Home and Help Pages: Visit these Web pages for the search tools that you
use most to remain current. FAQ [Frequently Asked Questions] also contains
help and other useful information.

Relevance Ranking: Each search engine has its own way of assigning
relevance. Higher relevance is normally given to query terms in the title
and first few words in the document. For some search engines, proximity and
frequency of use are also factors. It is unusual that the best source ranks
first, unless the query terms are optimally located in the document.
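
To make the idea concrete, here is a toy relevance score in Python. It is
only a sketch: the weights (3 for the title, 2 for the first few words, 1
for each occurrence) and the name toy_relevance are arbitrary assumptions
and do not describe how any actual search engine ranks its hits.

```python
def toy_relevance(query_terms, title, body, lead_words=5):
    """Score a document; the weights are arbitrary illustrative choices."""
    title_words = title.lower().split()
    body_words = body.lower().split()
    lead = body_words[:lead_words]  # the "first few words" of the document
    score = 0
    for term in (t.lower() for t in query_terms):
        score += 3 * title_words.count(term)  # query term in the title
        score += 2 * lead.count(term)         # query term near the start
        score += 1 * body_words.count(term)   # overall frequency of use
    return score


# Invented documents as (title, body) pairs, for illustration only.
docs = [
    ("A short note", "this note is mostly about cooking but mentions gardening once"),
    ("Gardening tips", "roses need sun and water"),
]

ranked = sorted(docs, key=lambda d: toy_relevance(["gardening"], *d), reverse=True)
for title, body in ranked:
    print(toy_relevance(["gardening"], title, body), title)
```

The document with the query term in its title ranks first, even though the
other document is the only one that uses the term in its text.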

SEARCH TOOLS

We recommend the following search tools, because each has somewhat
different capabilities and advantages. In this respect, they complement
each other, making it possible to find and retrieve even obscure
information. In time, and by trial and error, you will learn which are the
best for your use and under what circumstances.

This Reference represents our understanding of present practices. Expect
the contents to change as search tools expand their scope and improve their
performance.

1. ALL-IN-ONE

o Home Page Address: www.albany.net/allinone/
o Help Page Address: None available.
o Search Method: Keyword
o Data Base System: That of the search engines employed
o Graphics: Bypassed [An advantage for slow-downloading computers]
o Operators: Common Operators suitable
o Other Comments: An efficient means of using many search tools of your
choice in quick succession. [Its use is described in Section E under
"Searching By Keyword".]

2. ALTAVISTA

o Home Page Address: www.altavista.digital.com
o Help Page Address: Simple Query-
www.altavista.digital.com/cgi-bin/query?pg=h. Advanced Query-
www.altavista.digital.com/cgi-bin/query?pg=ah
o Search Method: Keyword
o Data Base System: Full text, having presently the largest and most
inclusive indices.
o Graphics: Medium
o Operators: If used for a multiple term query without suitable
operators, it can produce an enormous number of irrelevant hits and
relatively few relevant ones. Go to "Help Page" addresses for details
on the use of its operators. Advanced Searches start with the same
operators as Simple Searches and build from there and are the means of
obtaining the highest number of relevant references for the fewest
hits.
o Special Features: Can specify images and text. Can limit by date.
o Other Comments: Has the reputation of having the most sophisticated
search system. Serves as the preferred default search engine for
Yahoo. Presently does not index frames.

3. EXCITE

o Home Page Address: www.excite.com
o Help Page Address: www.excite.com/Info/searching.html?a-n-t
o Search Method: Subject and Keyword
o Data Base System: Full-text search of over 50 million documents.
Titles are apparently not searched.
o Graphics: Moderate.
o Operators: Combines simple and advanced keyword searches. Supports
automatic stemming and sorting by site.
o Search Features: Offers keyword searches for literal and concept
queries, but does better with concept searches. Concept search is the
default. A concept search looks for ideas related to a literal query.
Use of Boolean Operators turns off concept searching.
o Other Comments: Excite search is very good for the beginner. It is
easy to use, its headings and links are well organized, and the
instructions for its use are clearly presented.

4. HOTBOT

o Home Page Address: www.hotbot.com
o Help Page Address: http://www.hotbot.com/Help/intro.html
o Search Method: Keyword
o Data Base System: Full-text search of over 54 million documents.
o Graphics: Low
o Operators: Supports simple and expert [advanced] searches. Provides
detailed instructions on use of operators under Help.
o Special Features: Provides pull-down menus and buttons for refining
and focusing search.
o Other Comments: Does not provide stemming nor does it index frames.

5. INFOSEEK

o Home Page Address: www.infoseek.com
o Help Page Address: http://www.infoseek.com/Help?pg=DChelp.html
o Search Method: Subject [Infoseek Select] and Keyword
o Data Base System: Full text. Ultrasmart/Ultraseek searches over 50
million Web pages.
o Graphics: Low
o Operators: Provides detailed instructions on use of operators under
Help.
o Special Features: Provides access to many services under "Smart
Information" on Help Page.
o Other Comments: New site submissions are added immediately. Caters to
the needs of both beginners and advanced users. Does not read frames
or support stemming.

6. LOOKSMART

o Home Page Address: www.looksmart.com
o Help Page Address: www.looksmart.com/h/info/whyls.html
o Search Method: Subject. Also provides a non-allied keyword option
o Data Base System: Not described
o Graphics: Very high
o Operators: There is no description on use of any operators for the
keyword option.
o Other Comments: Easy-to-use magazine format for subject path.
Considerable use of frames.

7. MAMMA

o Home Page Address: www.mamma.com
o Help Page Address: http://www.mamma.com/faq.html#why
o Search Method: Multi-Engine
o Data Base System: That of search engines employed
o Graphics: Low
o Operators: Common Operators are applicable.
o Special Features: Accommodates incorrect operators.
o Other Comments: Can conduct parallel searches of 7 major search
engines.

8. MAGELLAN

o Home Page Address: www.mckinley.com
o Help Page Address: None available
o Search Method: Subject and keyword
o Graphics: Medium
o Operators: No available description.
o Other Comments: Not actually a search engine but more like an on-line
guide. Magellan's strength is in the quality of its subject reviews.
It also does well on popular sites. Acquired by Excite, but so far it
continues to operate separately.

9. METACRAWLER

o Home Page Address: www.metacrawler.com
o Help Page Address: www.metacrawler.com/faq.html#searches
o Search Method: Multi-Engine
o Data Base System: That of search engines employed
o Graphics: Low
o Operators: Common Operators are applicable
o Other Comments: Conducts parallel searches of 7 major search engines.
Acquired in February 1997 by go-2-net which is presently seeking to
expand its coverage.

10. ONEKEY

o Home Page Address: www.onekey.com
o Help Page Address: www.onekey.com/live/smart.htm#smartdef
o Search Method: Keyword with an independent or non-allied subject
listing
o Data Base System: Not explained
o Graphics: High
o Operators: Provides simple options of either "all terms" or "any
terms".
o Special Features: Offers subjects of interest to children
o Other Comments: The subject listing provided appears to be carefully
categorized. Also provides a long list of topics which can be quite
useful, if any match your interest.

11. SAVVYSEARCH

o Home Page Address: guaraldi.cs.colostate.edu
o Help Page Address: guaraldi.cs.colostate.edu:2000/help/
o FAQ: guaraldi.cs.colostate.edu:2000/form?beta
o Search Method: Multi-Engine
o Data Base System: That of search engines employed
o Graphics: Low
o Operators: Common Operators applicable. The "Help Page" provides
useful information for choosing options and displaying the results.
o Other Comments: Utilizes several of the major search engines.

12. YAHOO

o Home Page Address: www.yahoo.com
o Help Page Address: www.yahoo.com/docs/info/help.html
o FAQ: www.yahoo.com/docs/info/faq.html
o Search Method: Subject and keyword
o Data Base System: Limited coverage. Indexed by people.
o Graphics: Medium
o Operators: In keyword searches, selects only sites that contain all
search words. If no exact match is found switches automatically to
AltaVista and a looser search process.
o Special Features: Can search by title [t] and URL [u].
o Other Comments: Yahoo! search is very good for the beginner. It is
easy to use and the headings and links are well organized.

GLOSSARY OF WEB SEARCH TERMS

This glossary contains terms used both in this work and other articles
applicable to searching the WWW. For ease of use by the beginner, the
definitions are brief and in simple language.

Boolean Search A keyword search that uses Boolean Operators for obtaining a
precise definition of a query. [See "Operators Used In Keyword Searches" in
Section B]

Browsing A Directory Search, which is a method of searching the Web by
subject through linked documents. In popular use, browsing is accessing
information from the Internet.

Browser A program used to connect to sites on the World Wide Web. More
generally, a program that accesses information on the Internet. Examples of
WWW browsers are Netscape Navigator and Microsoft Internet Explorer.

Concept Search A query that implies a term's broader meaning, and not its
literal meaning.

Database Stored information about a topic or subject organized for
retrieval. A search engine database is kept current by means of an
automated search engine procedure called a robot or by author-supplied
information.

Directory Search A hierarchical search that proceeds through increasingly
more specific headings or sub-topics.

False Drops Documents that are retrieved but are not relevant to the user's
interest.

Full-Text Indexing An indexing method where every word in the Web page is
put into the database with the exception of prepositions, conjunctions, and
the like.
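
A minimal Python sketch of this idea follows. It assumes a small invented
set of pages and an illustrative list of skipped words. The names
STOP_WORDS and build_index are assumptions, and real search engines use
far larger word lists and more careful text handling.

```python
# Illustrative list of words left out of the index (not a standard list).
STOP_WORDS = {"a", "an", "and", "in", "of", "on", "or", "the", "to"}


def build_index(pages):
    """Map every indexed word to the set of page addresses containing it."""
    index = {}
    for url, text in pages.items():
        for word in text.lower().split():
            if word not in STOP_WORDS:
                index.setdefault(word, set()).add(url)
    return index


# Invented pages, for illustration only.
pages = {
    "example.com/history": "the history of the web",
    "example.com/search": "searching the web",
}

index = build_index(pages)
print(sorted(index["web"]))  # both pages contain "web"
print("the" in index)        # skipped words are not indexed
```

Because every remaining word is recorded, any of them can later serve as a
keyword for retrieving the page.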

Hierarchical A ranking of subjects from the most general to the most
specific.

Hits Documents or references to documents that are returned in response to
a query, also called matches or matching queries.

Hypertext Link A highlighted word or image [shown in color] on a Web page
that when clicked connects or links to another location with related
information. [Links provide an easy way to move about the Internet]

Internet A worldwide collection of computers and computer networks that can
communicate with each other. The Internet functions through Clients and
Servers. Clients are used to access and obtain information from databases.
Examples include on-line providers such as AOL and CompuServe. Servers are
used to provide information; examples are search tools and electronic mail
services.

Keyword Search A search that utilizes terms that define the user's
interest.

Link In WWW parlance, a hypertext link.

Location Box A designated place within a browser for an address [URL]. It
is the starting point for accessing a Web site.

Multi-Engine Search A search that uses several search engines in parallel
to provide a single response to a query.

Operator A rule or specific instruction on an aspect of composing a query
used to define the information sought.

Phrase Search One that states the words exactly as they are to be searched.
[A phrase is a string of words that are adjacent and related.]

Precision A standard measure of information retrieval. It is defined as the
number of relevant documents obtained divided by the total number of
documents retrieved.
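
As a worked example, the following Python sketch computes precision for an
invented result: 12 relevant documents among 30 retrieved. The function
name and the figures are assumptions chosen for illustration.

```python
def precision(relevant_retrieved, total_retrieved):
    """Relevant documents obtained divided by total documents retrieved."""
    if total_retrieved == 0:
        raise ValueError("no documents were retrieved")
    return relevant_retrieved / total_retrieved


# Invented figures: 12 of the 30 hits examined were relevant.
print(precision(12, 30))  # 0.4
```

A precision of 0.4 means that four out of every ten hits were useful.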

Proximity How closely words appear together within a document. "Adjacency"
or "phrase" usually means that words must appear exactly in the order
specified with no intervening words. "Near" usually means that the words
must appear within a certain number of words of each other, although exact
word order is not specified.
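
The difference between the two tests can be sketched in Python as follows.
This is a simplified illustration that assumes lower-case search terms and
text split on spaces. The function names and the default distance of five
words are assumptions, since each search tool sets its own limit.

```python
def positions(words, term):
    """Return every position at which the term occurs in the word list."""
    return [i for i, word in enumerate(words) if word == term]


def is_adjacent(text, first, second):
    """True if `second` directly follows `first`, with no intervening words."""
    words = text.lower().split()
    return any(
        i + 1 < len(words) and words[i + 1] == second
        for i in positions(words, first)
    )


def is_near(text, a, b, max_distance=5):
    """True if the terms occur within max_distance words, in either order."""
    words = text.lower().split()
    return any(
        abs(i - j) <= max_distance
        for i in positions(words, a)
        for j in positions(words, b)
    )


text = "the search engine ranks each web page"
print(is_adjacent(text, "search", "engine"))  # True: exact order, adjacent
print(is_adjacent(text, "engine", "search"))  # False: order is reversed
print(is_near(text, "page", "search"))        # True: five words apart
```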

Query A search request. A combination of words and symbols that defines the
information that the user is seeking. [Queries are used to direct the
search tool to appropriate databases.]

Query By Example Use of an example to solicit more like information.

Ranking A means of listing hits in the order of their relevancy. It is
usually determined by how well the reference matches the query and by the
number of occurrences of the term in the document being searched.

Relevance The usefulness of a response to a query.

Robot The software for adding or updating databases by scanning documents
via a network of links. [A robot is also known as a spider, crawler and
indexer].

Search Box The place within a search engine's home page to enter a query.

Search Engine A computer program that locates information through the use
of keywords. The search engine usually resides in a host computer and
provides information service to other computers on request.

Search Tool The software which conducts a search by one of several methods,
namely Directory, Search Engine, Directory/Search Engine and Multi-Engine.

Site A location on the Internet. In WWW, it is called a Web site and
identified by its URL.

Spider See Robot

Stemming The use of a stem [i.e. root] of a word to search words that are
derived from it. For example, "child" would retrieve information on child,
children, childhood, childless, etc.
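
The "child" example can be sketched in Python as a simple match on the
start of each word. This is a simplification under the assumption that the
stem is a plain prefix. The function name matches_stem is hypothetical, and
search tools that support stemming may apply more elaborate rules.

```python
def matches_stem(stem, words):
    """Return the words that begin with the given stem."""
    stem = stem.lower()
    return [word for word in words if word.lower().startswith(stem)]


# Invented word list, for illustration only.
words = ["child", "children", "childhood", "childless", "chill"]
print(matches_stem("child", words))
```

Here "chill" is left out because it does not begin with the full stem.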

Term A single word or combination of words used in a query.

Truncation See Stemming

Uniform Resource Locator [URL] A unique address on the World Wide Web.

Web Server A computer program that accepts requests for information,
processes the requests, and provides files accordingly.

Web Site A specific address or URL in a computer network.

John S. Walker
Publisher, CSS Internet News (tm)
(Internet Training and Research)
PO Box 57247, Jackson Stn.,
Hamilton, Ontario, Canada, L8P 4X1
Email: jwalker@networx.on.ca

© 1999--All rights reserved

http://members.tripod.com/kccesl/search.html
Posted to the Web on September 22, 1999