Finding it On the Net

a Carolina Global School House workshop

Web Directories
Definition: A hierarchically organized database of Internet resources. Specific sources can be found by browsing a logical arrangement of subjects, topics, and subtopics.

 

Advantages: The final page of Internet resources that a web directory session produces will contain a high concentration of relevant hits.
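
Browsing a directory can be pictured as walking down a tree of categories. The sketch below is only an illustration: the category names and the example.org addresses are made up, and a real directory such as Yahoo is far larger and maintained by human editors.

```python
# Toy web directory: nested dicts are categories, lists are the final resource pages.
# All category names and URLs below are invented for illustration.
DIRECTORY = {
    "Science": {
        "Earth Sciences": {
            "Earthquakes": [
                "http://example.org/quake-basics",
                "http://example.org/seismology-for-kids",
            ],
            "Volcanoes": ["http://example.org/volcano-facts"],
        },
    },
    "Recreation": {
        "Sports": {
            "Basketball": ["http://example.org/hoops-history"],
        },
    },
}


def browse(directory, path):
    """Follow a list of category names down the hierarchy and return what is found."""
    node = directory
    for category in path:
        node = node[category]  # raises KeyError if the category does not exist
    return node


# Subject -> topic -> subtopic leads to a short, focused list of resources.
hits = browse(DIRECTORY, ["Science", "Earth Sciences", "Earthquakes"])
print(hits)
```

Because every step narrows the subject, the final list is short and mostly relevant, which is the advantage described above.
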
Search Engines
Definition: A search tool that searches an index which grows automatically as spider or crawler programs find new resources.
Advantages: A search engine represents a very large portion of the Internet and includes a vast variety of Internet resources.
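
To see how an index can grow automatically, here is a minimal sketch of a spider working on a made-up, in-memory "web". The page names, texts, and links are assumptions for illustration only; a real spider fetches pages over the network and is much more careful.

```python
from collections import deque

# A made-up miniature "web": page name -> (text on the page, links to other pages).
TOY_WEB = {
    "home": ("welcome to the earthquake page", ["quakes", "bulls"]),
    "quakes": ("recent earthquake reports and seismology notes", ["home", "engineer"]),
    "engineer": ("engineer buildings to resist earthquake damage", []),
    "bulls": ("bulls basketball scores", ["home"]),
    "orphan": ("nobody links here", []),
}


def crawl(web, start):
    """Visit pages by following links from start; return an index of word -> pages."""
    index = {}
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        text, links = web[page]
        # Add every word on the page to the index.
        for word in text.split():
            index.setdefault(word, set()).add(page)
        # Queue each linked page that has not been visited yet.
        for link in links:
            if link in web and link not in seen:
                seen.add(link)
                queue.append(link)
    return index


index = crawl(TOY_WEB, "home")
print(sorted(index["earthquake"]))
```

Note that the "orphan" page never reaches the index because no other page links to it. A spider can only find what it can reach by following links.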

 

Yahoo
http://www.yahoo.com

Yahooligans
http://www.yahooligans.com

CCRN NET
http://www.ccrnnet.com/

VR-USA
http://www.vr-usa.com/

E-Map
http://www.e-map.com

Alta Vista
http://www.altavista.digital.com

Excite
http://www.excite.com

Hotbot
http://www.HotBot.com

HumanSearch
http://www.humansearch.com/

InfoSeek
http://www.infoseek.com

Lycos
http://www.lycos.com

News Hunt
http://www.newshunt.com

Open Text
http://www.opentext.com

WebCrawler
http://www.webcrawler.com

 

Some Web Search Resources

An extensive list of web directories: http://www.yahoo.com/Computers_and_Internet/Internet/World_Wide_Web/Searching_the_Web/Web_Directories/
An extensive list of search engines: http://www.yahoo.com/Computers_and_Internet/Internet/World_Wide_Web/Searching_the_Web/Search_Engines/
Comparison Chart: http://www.kcpl.lib.mo.us/search/chart.htm
Reviews of prominent search engines: http://www.kcpl.lib.mo.us/search/srchengines.htm
Search Engine Watch, a service that keeps track of Internet search engines: http://www.searchenginewatch.com/

 

Search Engine Glossary

Boolean search: A search in which connectors such as AND, NOT and OR are used to include and exclude pages based on certain words.
Full-text index: A search engine index that includes every word in the web pages that it represents.
Index: The searchable database of records, each representing an Internet document. The database is created by search engine programs called spiders or crawlers. Index is often used as a synonym for search engine.
Keyword search: A search of an index for words specified by the user.
Phrase search: A search for documents containing an exact sentence or phrase specified by a user. Phrases are usually enclosed in quotation marks.
Search Engine: An Internet search tool that consists of a web page, a program for searching an index, the index (database) which represents a sizable part of the Internet, and spider or crawler programs that find new resources and add them to the index.
Spider: The software that scans documents and adds them to an index by following links. Spider programs are sometimes called crawlers.
Stemming: The ability of a search to include the "stem" of words. For example, stemming allows a user to enter "swimming" and also get back results for the stem word "swim."
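
Several of the glossary terms can be tried out on a tiny scale. The sketch below builds a full-text index over three made-up pages and runs a keyword search, a phrase search, and a search with stemming. The page texts and the very rough `naive_stem` helper are assumptions for illustration; real search engines use far more careful methods.

```python
import re

# Made-up documents standing in for web pages.
PAGES = {
    "page1": "William Shakespeare wrote many plays",
    "page2": "Shakespeare festival directed by William Jones",
    "page3": "Swim meets every Saturday",
}


def words(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z]+", text.lower())


def naive_stem(word):
    """Very rough stemmer (illustration only): 'swimming' -> 'swim'."""
    if word.endswith("ing") and len(word) > 5:
        word = word[:-3]
        if len(word) > 2 and word[-1] == word[-2]:
            word = word[:-1]  # drop the doubled letter, 'swimm' -> 'swim'
    return word


def build_index(pages):
    """Full-text index: every word -> {page: [positions of the word on that page]}."""
    index = {}
    for name, text in pages.items():
        for position, word in enumerate(words(text)):
            index.setdefault(word, {}).setdefault(name, []).append(position)
    return index


def keyword_search(index, word, stem=False):
    """Pages containing the word, or any word with the same stem if stem=True."""
    word = word.lower()
    if not stem:
        return set(index.get(word, {}))
    target = naive_stem(word)
    return {page for w, postings in index.items()
            if naive_stem(w) == target for page in postings}


def phrase_search(index, phrase):
    """Pages where the words of the phrase appear next to each other, in order."""
    terms = words(phrase)
    if not terms:
        return set()
    hits = set()
    for page, positions in index.get(terms[0], {}).items():
        for start in positions:
            if all(start + i in index.get(term, {}).get(page, [])
                   for i, term in enumerate(terms)):
                hits.add(page)
    return hits


INDEX = build_index(PAGES)
print(sorted(keyword_search(INDEX, "william")))             # both Shakespeare pages
print(sorted(phrase_search(INDEX, "William Shakespeare")))  # only page1
print(sorted(keyword_search(INDEX, "swimming", stem=True))) # page3, via the stem "swim"
```

The keyword search finds both pages that mention "William", while the phrase search keeps only the page where "William" is immediately followed by "Shakespeare".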

 

 

Speaking BOOLEAN

The Boolean search language is named after George Boole, a 19th-century mathematician. Boolean searching is a way of telling search engines and other search tools exactly what information you seek. It lets you include relevant topics and exclude irrelevant ones, filtering out unwanted information.

Topic: AND
Description: "AND" is a connector. It requires that every hit (a web page that fulfills the search criteria) contain both of the keywords joined by the "AND". This connector tends to decrease the number of hits.
Example: Bulls AND Basketball
Both "Bulls" and "Basketball" will be present in the selected web pages.

Topic: OR
Description: "OR" is also a connector. It requires that either of the connected keywords be present in the hits, or both. This connector tends to increase the number of hits.
Example: earthquake OR seismology
This search will select web pages that have either "earthquake" or "seismology", or both terms.

Topic: NOT
Description: "NOT" precedes keywords that must not appear in the selected web pages. This connector tends to decrease the number of hits.
Example: earthquake NOT engineer
This search will select web pages that include the word "earthquake" but do not include the word "engineer". This serves to filter out web pages related to engineering buildings to prevent earthquake damage.

Topic: Quotes ("")
Description: Quotes are used to define phrases. Keywords that are enclosed in quotes will be searched for exactly as they appear in the quotes. This technique is good for finding specific titles of documents, such as "The Declaration of Independence".
Example: "William Shakespeare"
This will search for pages in which the words "William" and "Shakespeare" appear together, in that order. On some search engines, the "W" and "S" must also be capitalized and the remaining letters in lower case.

Topic: Parentheses
Description: Parentheses define the order in which the parts of a search are evaluated. Relationships or phrases enclosed within parentheses are tested before relationships outside of the parentheses. Think algebra!
Example: (earthquake AND recent) NOT (engineer OR candy)
This search will first evaluate the two groups in parentheses: pages with both "earthquake" and "recent", and pages with "engineer" or "candy". Then it will follow the NOT connector and remove from the first group all pages with "engineer" or "candy".

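One way to picture the Boolean connectors is as operations on sets of pages: AND keeps pages found in both sets, OR combines the sets, and NOT removes one set from another. The sketch below uses five made-up pages and a hypothetical helper `has`; it illustrates the logic only and is not how any particular search engine works internally.

```python
# Made-up pages; each one is reduced to the set of words it contains.
PAGES = {
    "a": "recent earthquake shakes city",
    "b": "engineer designs earthquake proof tower",
    "c": "seismology lecture notes",
    "d": "recent earthquake candy sale",
    "e": "bulls basketball highlights",
}
WORDS = {name: set(text.split()) for name, text in PAGES.items()}


def has(word):
    """Return the set of pages that contain the keyword."""
    return {name for name, page_words in WORDS.items() if word.lower() in page_words}


# AND is set intersection (&), OR is set union (|), NOT is set difference (-).
both = has("Bulls") & has("Basketball")          # Bulls AND Basketball
either = has("earthquake") | has("seismology")   # earthquake OR seismology
without = has("earthquake") - has("engineer")    # earthquake NOT engineer

# Parentheses in the search expression become parentheses in the code.
grouped = (has("earthquake") & has("recent")) - (has("engineer") | has("candy"))

print(sorted(both))     # ['e']
print(sorted(either))   # ['a', 'b', 'c', 'd']
print(sorted(without))  # ['a', 'd']
print(sorted(grouped))  # ['a']
```

As the results show, AND and NOT shrink the list of hits, while OR makes it longer.
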
 

 

 

SEARCH

A Process for searching the Internet

 

Start with a small database search tool -- Yahoo

Conduct this search to get a grasp of the types of resources that are available on the Internet. Yahoo will give you a very good representation. This gives you a chance to see both the good and the bad hits you will be getting, and to identify the keywords common to the good hits and the keywords common to the bad hits. Both are equally valuable.

Edit search phrase as a result

Create a Boolean search expression from the common words discovered in your Yahoo search. Ex: (Earthquake OR seismology) NOT (engineer OR prediction OR individual)

Advance into a large database search engine -- Alta Vista or Excite

Use your edited search expression on a large database such as Alta Vista or Excite.

Refine search phrase

Again, examine the good and bad hits, identify more keywords, and incorporate them in your search expression...refine your search phrase.

Cycle back and Advance again

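Pretty clear! Take the refined expression back to the large search engine, run it again, and repeat the Refine step until most of the hits are useful.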

Harvest resources

Good Luck!
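
The Edit and Refine steps above can be sketched as a small helper that turns two lists of keywords into a Boolean search expression. The function names `or_group` and `build_expression` are made up for this illustration, and the exact connector syntax accepted may differ from one search engine to another.

```python
def or_group(keywords):
    """Join keywords with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(keywords) + ")"


def build_expression(wanted, unwanted):
    """Combine keywords from good hits and bad hits into one search expression."""
    expression = or_group(wanted)
    if unwanted:
        expression += " NOT " + or_group(unwanted)
    return expression


# Edit: keywords noticed in the good and bad hits of the first search.
wanted = ["Earthquake", "seismology"]
unwanted = ["engineer", "prediction", "individual"]
print(build_expression(wanted, unwanted))

# Refine: after examining the new hits, add another word to exclude and search again.
unwanted.append("candy")
print(build_expression(wanted, unwanted))
```

The first expression printed is the same as the example in the Edit step. Each pass through the cycle simply adds words to one list or the other.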