Searching the Internet Part II

November 11, 1998


The Language of Search Engines

Search engines are your helpers. They are information assistants that aid you in finding the information you need to solve a problem, answer a question, or make a decision. As with any other assistant, how much they can help depends on how well you can tell them what you want. Communicating with your search engine is therefore a critical part of the search process.

Search engines need to know what information you seek, and they need this information communicated in a logical way -- they are, after all, computers. The language that we traditionally use to talk with computer-based searching tools is called "Boolean," named after George Boole, a mathematician of the 19th century.

In Boolean, we use "keywords" to tell the search engine which words to look for in its index or database. We also use "operators" to describe how our keywords relate to one another and to the information that we need. The basic operators are AND, OR, and NOT.

Let's use an example to explore how we would use Boolean to search for information on the Internet. We will look for information about Native Americans in the state of Ohio. In the descriptions below, we will explore several concepts involved in speaking Boolean and relate these concepts to our search.

 

Concept Explanation/Example

Keyword

A keyword is a word or term that we want the search engine to consider in looking for relevant information. In our example, one word that would likely appear in a web page about Native Americans is "Indian."

Example: Indian

 

OR

In many cases, there may be a synonym of our keyword that might appear in the web page instead of the keyword we have already chosen. So we will want to expand the number of pages that the search engine sends us to include the ones using the synonym. In the case of our example, many web pages would likely use the term "Native American," which is more commonly used today than "Indian." In this case we would use the operator, OR, to say that we want web pages with either the word "Indian" or the term "Native American."

Example: Indian OR Native American
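One way to picture OR is as the joining of two sets of pages. The sketch below is only an illustration: the three pages, the PAGES table, and the pages_with helper are invented here, and matching is done by simple substring search rather than by anything a real search engine does.

```python
# Hypothetical pages, invented for illustration: title -> text.
PAGES = {
    "Mound builders": "indian mounds of southern ohio",
    "Tribal history": "native american nations of the great lakes",
    "Road atlas": "highways and rest stops in ohio",
}

def pages_with(term):
    """Return the titles of pages whose text contains the term."""
    return {title for title, text in PAGES.items() if term.lower() in text}

# OR widens the search: a page qualifies if it contains either term.
either = pages_with("Indian") | pages_with("Native American")
print(sorted(either))
```

Each keyword alone finds one page; joined with OR, the search returns both.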

 

AND

Since we are looking for information about Native Americans in the state of Ohio, an additional keyword will be "Ohio." We want to narrow the web pages that we get to only those about Native Americans in Ohio, so we will say that both terms must be present. Here is where we will use AND.

Example: Indian OR Native American AND Ohio
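Where OR joins sets of pages, AND keeps only the pages that two sets have in common. The following sketch uses invented pages and a hypothetical pages_with helper, and it assumes the OR part of the expression is worked out first.

```python
# Hypothetical pages, invented for illustration: title -> text.
PAGES = {
    "Mound builders": "indian mounds of southern ohio",
    "Tribal history": "native american nations of the great lakes",
    "Ohio tribes": "native american tribes of ohio",
    "Road atlas": "highways and rest stops in ohio",
}

def pages_with(term):
    """Return the titles of pages whose text contains the term."""
    return {title for title, text in PAGES.items() if term.lower() in text}

# First widen with OR, then narrow with AND (set intersection).
about_people = pages_with("Indian") | pages_with("Native American")
narrowed = about_people & pages_with("Ohio")
print(sorted(narrowed))
```

The page about the Great Lakes and the road atlas both drop out, because neither satisfies both sides of the AND.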

 

NOT

As we think through the information that we are likely to receive, we realize that there is a baseball team in Cleveland, Ohio called the Indians. We will want to filter out all web pages about the baseball team. So we will add a new keyword, "baseball", and connect it to our search expression with the operator, NOT. We are saying that the acceptable web pages should NOT include the keyword "baseball."

Example: Indian OR Native American AND Ohio NOT baseball
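NOT works as a filter on pages that have already matched. In this minimal sketch, RESULTS is an invented set of pages assumed to have passed the earlier part of the search; the last step simply discards any page that mentions the unwanted keyword.

```python
# Hypothetical pages assumed to have matched the search so far: title -> text.
RESULTS = {
    "Mound builders": "indian mounds of southern ohio",
    "Ohio tribes": "native american tribes of ohio",
    "Game recap": "cleveland indians win baseball game in ohio",
}

# NOT removes every page that contains the excluded keyword.
filtered = {title for title, text in RESULTS.items() if "baseball" not in text.lower()}
print(sorted(filtered))
```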

 

Quotes

Just as we use commas, question marks, and other punctuation marks to help communicate with people, we use special symbols to clarify what we want from the search engine. One example is the use of quotation marks to define phrases. In our example, Native American is going to look like two separate words to the search engine, each of which could appear anywhere in the web page. To communicate that these two words belong together as a distinct phrase, we use quotes.

Example: Indian OR "Native American" AND Ohio NOT baseball
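The difference between two separate words and a quoted phrase can be sketched as follows. The functions has_all_words and has_phrase and the sample sentence are hypothetical, written only to show the idea; real search engines handle phrases in their own ways.

```python
import re

def has_all_words(text, words):
    """True if every word appears somewhere in the text, in any order."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return all(w.lower() in tokens for w in words)

def has_phrase(text, phrase):
    """True if the words of the phrase appear side by side, in order."""
    tokens = re.findall(r"[a-z]+", text.lower())
    target = phrase.lower().split()
    n = len(target)
    return any(tokens[i:i + n] == target for i in range(len(tokens) - n + 1))

page = "An American chef, a native of Ohio, opens a diner."
print(has_all_words(page, ["Native", "American"]))  # both words occur, far apart
print(has_phrase(page, "Native American"))          # but never as a phrase
```

Without quotes, this unrelated page would count as a match; with quotes, it would not.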

 

Parentheses

Each operator in a search expression defines a distinct "keyword concept."

Keyword 1 AND Keyword 2

Keyword 3 OR Keyword 4

Keyword 5 NOT Keyword 6

A keyword concept can consist of:

A single keyword or phrase

Two single keywords or phrases connected by an operator

Keyword concepts connected by an operator to other keyword concepts or single keywords or phrases.

Individual keyword concepts are marked by enclosing them in parentheses. In our example, the following are distinct keyword concepts:

Indian

(Indian OR "Native American")

((Indian OR "Native American") AND Ohio)

The final keyword concept, the one that includes all constituent keyword concepts, is called our search expression.

Example: ((Indian OR "Native American") AND Ohio) NOT baseball
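To see why the parentheses matter, compare the grouped expression with one other way the ungrouped words could be read. Both functions below are hypothetical and use simple substring matching; how a particular search engine reads an expression without parentheses is not something this sketch claims to know.

```python
def grouped(text):
    """((Indian OR "Native American") AND Ohio) NOT baseball"""
    t = text.lower()
    return (("indian" in t or "native american" in t) and "ohio" in t) and "baseball" not in t

def other_reading(text):
    """Indian OR ("Native American" AND Ohio NOT baseball)"""
    t = text.lower()
    return "indian" in t or ("native american" in t and "ohio" in t and "baseball" not in t)

# Invented sample pages.
for page in ["Indian cooking classes in Texas",
             "Cleveland Indians baseball schedule, Ohio",
             "Native American history of Ohio"]:
    print(page, grouped(page), other_reading(page))
```

The grouped expression rejects the first two pages and accepts the third. The other reading accepts all three, because the lone keyword "Indian" is enough to satisfy it.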

 

Admittedly, Boolean Logic is not the simplest thing to understand or to teach. However, it is a very effective way of communicating your information needs to search engines.

To make things easier for casual users, Internet search engines have developed alternatives to traditional Boolean Logic. One of the most common conventions is the use of pluses (+) and minuses (-) to indicate which terms must (+) and must not (-) be present in the returned documents. Each search engine has developed its own version of these searching conventions, each trying to improve upon the others, and this evolution of the search language continues. None is perfect, and you will find that finding information on the Internet is more a process than the click of a button. It is like being a detective.
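The plus and minus convention can be sketched in a few lines. This is a simplified reading, not the behavior of any particular search engine: parse_plus_minus and matches are hypothetical names, and terms without a sign are assumed to be optional and are ignored when filtering.

```python
def parse_plus_minus(query):
    """Split a query into (required, excluded, optional) term lists."""
    required, excluded, optional = [], [], []
    for token in query.split():
        if token.startswith("+") and len(token) > 1:
            required.append(token[1:].lower())
        elif token.startswith("-") and len(token) > 1:
            excluded.append(token[1:].lower())
        else:
            optional.append(token.lower())
    return required, excluded, optional

def matches(text, query):
    """Every + term must be present and no - term may be present."""
    required, excluded, _optional = parse_plus_minus(query)
    text = text.lower()
    return all(t in text for t in required) and not any(t in text for t in excluded)

query = "Indian +Ohio -baseball"
print(parse_plus_minus(query))
print(matches("Indian mounds of southern Ohio", query))
print(matches("Cleveland Indians baseball in Ohio", query))
```

Here +Ohio plays the role of AND Ohio, and -baseball plays the role of NOT baseball.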


Copyright © 1998 by David Warlick
All rights reserved