Searching the Internet
Part II
November 11, 1998
The Language of Search Engines
|
Search
engines are your helpers. They are information
assistants who aid you in finding the information
that you need to solve a problem, answer a
question, or make a decision. Like any other
assistant, the degree to which they are able to
help depends on the degree to which you are able to
tell them what you want. Therefore, communicating
with your search engine is a critical part of the
search process.
Search
engines need to know what information you seek, and
they need this information communicated in a
logical way -- they are, after all, computers. The
language that we traditionally use to talk with
computer-based searching tools is called "Boolean,"
named after George Boole, a mathematician of the
19th century.
In Boolean
we use "keywords" to describe what words to look
for when searching the search engine's index or
database. We also use "operators" to describe the
relationships among our "keywords," and between our "keywords" and the
information that we need. The basic "operators" are
AND, OR, and NOT.
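One way to picture the three operators is as set operations over the pages that contain each keyword. The following is a minimal Python sketch, not a description of how any particular search engine works; the page numbers and the tiny `index` dictionary are invented for illustration.

```python
# Hypothetical index: each keyword maps to the set of page IDs containing it.
index = {
    "indian": {1, 2, 3, 5},
    "ohio": {2, 3, 4},
    "baseball": {3, 6},
}

# AND keeps pages that contain both keywords (set intersection).
indian_and_ohio = index["indian"] & index["ohio"]

# OR keeps pages that contain either keyword (set union).
indian_or_ohio = index["indian"] | index["ohio"]

# NOT removes pages that contain the second keyword (set difference).
indian_not_baseball = index["indian"] - index["baseball"]

print(sorted(indian_and_ohio))      # [2, 3]
print(sorted(indian_or_ohio))       # [1, 2, 3, 4, 5]
print(sorted(indian_not_baseball))  # [1, 2, 5]
```

Seen this way, AND narrows a search, OR widens it, and NOT filters pages out of it.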
Let's use an
example to explore how we would use Boolean to
search for information on the Internet. We will
look for information about Native Americans in the
state of Ohio. In the descriptions below, we will
explore several concepts involved in speaking
Boolean and relate these concepts to our
search.
|
Concept Explanation/Example
|
Keyword
|
A
keyword is a word or term that we want the
search engine to consider in looking for
relevant information. In our example one
word that would likely appear in a web
page about Native Americans is
"Indian."
Example: Indian
|
OR
|
In
many cases, there may be a synonym of our
keyword that might appear in the web page
instead of the keyword we have already
chosen. So we will want to expand the
number of pages that the search engine
sends us to include the ones using the
synonym. In the case of our example, many
web pages would likely use the term
"Native American," which is more commonly
used today than "Indian." In this case we
would use the operator, OR, to say that we
want web pages with either the word
"Indian" or the term "Native
American."
Example: Indian OR
Native American
|
AND
|
Since we are looking
for information about Native Americans in
the state of Ohio, an additional
keyword will be "Ohio." We want to narrow
the web pages that we get to only those
about Native Americans in Ohio, so we will
say that both terms must be present. Here
is where we will use AND.
Example: Indian OR
Native American AND Ohio
|
NOT
|
As
we think through the information that we
are likely to receive, we realize that
there is a baseball team in Cleveland,
Ohio called the Indians. We will want to
filter out all web pages about the
baseball team. So we will add a new
keyword, "baseball", and connect it to our
search expression with the operator, NOT.
We are saying that the acceptable web
pages should NOT include the keyword
"baseball."
Example: Indian OR
Native American AND Ohio NOT
baseball
|
Quotes
|
Just as we use commas,
question marks, and other punctuation marks to
help communicate with people, we use
special symbols to clarify what we want
from the search engine. One example is the
use of quotation marks to define phrases.
In our example, Native American is going
to look like two separate words to the
search engine that could each appear any
place in the web page. To communicate that
these two words belong together as a
distinct phrase, we use quotes.
Example: Indian OR
"Native American" AND Ohio NOT
baseball
|
Parentheses
|
Each operator in a
search expression defines a distinct
"keyword concept."
Keyword 1 AND Keyword 2
Keyword 3 OR Keyword 4
Keyword 5 NOT Keyword 6
A keyword concept can consist of:
A single keyword or phrase
Two single keywords or phrases connected by an operator
Keyword concepts connected by an operator to other keyword concepts or single keywords or phrases
Individual keyword
concepts are marked by enclosing them in
parentheses. In our example, the following
are distinct keyword concepts:
Indian
(Indian OR "Native American")
((Indian OR "Native American") AND Ohio)
The
final keyword concept, the one that
includes all constituent keyword concepts,
is called our search expression.
Example: ((Indian OR
"Native American") AND Ohio) NOT
baseball
|
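The finished search expression from the table can be tried against a handful of pages to see why the parentheses matter. The sketch below is only an illustration under stated assumptions: the four sample pages are made up, the `has` helper is hypothetical and uses plain case-insensitive substring matching (so "Indian" also matches "Indians"), and the ungrouped version is read with Python's own precedence rules, where `and` binds more tightly than `or`. A real search engine may resolve an ungrouped expression differently.

```python
# Hypothetical sample pages; the names and text are made up for illustration.
pages = {
    "history": "Native American history in Ohio",
    "mounds": "Indian burial mounds of southern Ohio",
    "team": "Cleveland Indians baseball schedule, Ohio",
    "plains": "Indian tribes of the Great Plains",
}

def has(text, term):
    """True if the word or phrase occurs in the text, ignoring case."""
    return term.lower() in text.lower()

def grouped(text):
    # ((Indian OR "Native American") AND Ohio) NOT baseball
    return ((has(text, "Indian") or has(text, "Native American"))
            and has(text, "Ohio")) and not has(text, "baseball")

def ungrouped(text):
    # Indian OR "Native American" AND Ohio NOT baseball, read with Python's
    # precedence: Indian OR ("Native American" AND Ohio AND NOT baseball).
    return (has(text, "Indian")
            or has(text, "Native American") and has(text, "Ohio")
            and not has(text, "baseball"))

grouped_hits = [name for name, text in pages.items() if grouped(text)]
ungrouped_hits = [name for name, text in pages.items() if ungrouped(text)]

print(grouped_hits)    # ['history', 'mounds']
print(ungrouped_hits)  # ['history', 'mounds', 'team', 'plains']
```

With the parentheses in place, only the two pages about Native Americans in Ohio come back. Without them, under this reading, any page mentioning "Indian" is accepted, including the baseball page and the page that has nothing to do with Ohio.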
Admittedly,
Boolean Logic is not the simplest thing to
understand or to teach. However, it is a very
effective way of communicating your information
needs to search engines.
To make
things easier for casual users, Internet search
engines have developed alternatives to traditional
Boolean Logic. One of the most common conventions
is the use of pluses (+) and minuses (-) to
indicate which terms must (+) and must not (-) be
present in the returned documents. Each search
engine has developed its own version of these
searching conventions, each trying to improve upon
these standards, and this evolution of the search
language continues. None is perfect, and you will
find that getting information from the Internet is
more a process than the click of a button. It is
like being a detective.
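The plus and minus convention can be sketched in a few lines of Python. This is a simplified, hypothetical interpretation rather than the rules of any actual search engine: `parse_query` and `matches` are names invented here, terms without a sign are collected but not used for filtering, and matching is plain case-insensitive substring search.

```python
import shlex

def parse_query(query):
    """Split a +/- query into (required, excluded, optional) term lists."""
    required, excluded, optional = [], [], []
    for token in shlex.split(query):  # keeps "quoted phrases" together
        if token.startswith("+") and len(token) > 1:
            required.append(token[1:])
        elif token.startswith("-") and len(token) > 1:
            excluded.append(token[1:])
        else:
            optional.append(token)
    return required, excluded, optional

def matches(text, query):
    """True if the text has every required term and no excluded term."""
    required, excluded, _ = parse_query(query)
    lowered = text.lower()
    return (all(term.lower() in lowered for term in required)
            and not any(term.lower() in lowered for term in excluded))

query = '+"Native American" +Ohio -baseball'
print(parse_query(query))
# (['Native American', 'Ohio'], ['baseball'], [])
print(matches("Native American history in Ohio", query))            # True
print(matches("Ohio baseball and Native American mascots", query))  # False
```

In this sketch a plus sign plays the role of AND and a minus sign the role of NOT, which covers much of what a casual search needs without writing a full Boolean expression.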