Google

Wednesday, July 29, 2009

How Search Engines Work

The term "search engine" is often used generically to describe both crawler-based search engines and human-powered directories. These two types of search engines gather their listings in radically different ways.

Search engines are designed to search for and retrieve stored information quickly and efficiently. Once a search engine has retrieved the information, it ranks the results in order of importance, using its own ranking criteria.

Most search engines consist of three major components. They are:

  • The Spider
  • The Index
  • The Ranking System

The Spider
Search engines send out electronic robots (bots for short), sometimes called spiders, to search the Internet relentlessly and gather information. The spider is the lifeblood of the search engine.

When a search engine’s spider finds a new website, it crawls all over the site, records the information on that website’s pages, and returns this information to the search engine. This process is referred to as spidering or crawling a website.
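The crawling process described above can be sketched in a few lines of Python. This is only a minimal illustration, not how any real search engine's spider is built: the three-page FAKE_SITE dictionary, the LinkParser class, and the crawl function are all hypothetical names invented for this example, and the "web" is an in-memory dictionary so the sketch runs without a network connection.

```python
from collections import deque
from html.parser import HTMLParser


class LinkParser(HTMLParser):
    """Collects href targets from <a> tags and the visible text of a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        if data.strip():
            self.text_parts.append(data.strip())


# Hypothetical three-page site standing in for the live web.
FAKE_SITE = {
    "/": '<h1>Home</h1><a href="/about">About</a> <a href="/news">News</a>',
    "/about": '<p>About us</p><a href="/">Home</a>',
    "/news": '<p>Latest news</p><a href="/about">About</a><a href="/missing">Gone</a>',
}


def crawl(start, fetch):
    """Breadth-first crawl from `start`; returns {url: page text}."""
    seen = {start}          # URLs already queued, so no page is visited twice
    queue = deque([start])
    recorded = {}
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:    # page could not be fetched; skip it
            continue
        parser = LinkParser()
        parser.feed(html)
        parser.close()
        recorded[url] = " ".join(parser.text_parts)
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return recorded


pages = crawl("/", FAKE_SITE.get)
for url, text in pages.items():
    print(url, "->", text)
```

The spider starts at one page, records its text, and follows every link it has not seen before. A real spider would fetch pages over HTTP and obey each site's crawling rules, which this sketch leaves out.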

You can also ask a search engine to spider (list) your site by submitting your domain name to it. If you have a website and access to its logs (statistics of who visits your website), you can track the constant movement of spiders through your site. They will show up in your logs with names like Googlebot (a Google spider) or Scooter (AltaVista).
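As a rough sketch of how you might spot spiders in your logs, the Python below counts lines that mention a known spider name. The log lines in SAMPLE_LOG are made up for illustration, the count_spider_hits function is a hypothetical helper, and the exact layout of your own logs will depend on your web server.

```python
from collections import Counter

# Substrings to look for in each log line (case-insensitive).
# The two names come from the text above; real spiders use many others.
SPIDER_NAMES = ("googlebot", "scooter")

# Made-up log lines in a common access-log style, for illustration only.
SAMPLE_LOG = [
    '192.0.2.10 - - [29/Jul/2009:10:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "Googlebot/2.1"',
    '192.0.2.44 - - [29/Jul/2009:10:00:05 +0000] "GET /about HTTP/1.1" 200 300 "-" "Mozilla/5.0"',
    '192.0.2.10 - - [29/Jul/2009:10:00:09 +0000] "GET /news HTTP/1.1" 200 410 "-" "Googlebot/2.1"',
    '192.0.2.77 - - [29/Jul/2009:10:01:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Scooter/3.3"',
]


def count_spider_hits(lines):
    """Return a Counter mapping spider name -> number of log lines naming it."""
    counts = Counter()
    for line in lines:
        lowered = line.lower()
        for name in SPIDER_NAMES:
            if name in lowered:
                counts[name] += 1
    return counts


hits = count_spider_hits(SAMPLE_LOG)
for name, count in hits.items():
    print(name, count)
```

To run this against a real log, you could pass an open file object instead of SAMPLE_LOG, since the function only needs something it can loop over line by line.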

The Index
Everything the spider finds goes into the second part of the search engine, the index or catalogue. This is simply a very large database, accessible through the Internet.
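One common way to picture such an index is as a mapping from each word to the pages that contain it, often called an inverted index. The sketch below is an assumption about how an index could be organised, not a description of any particular search engine's database; the PAGES data and the build_index and lookup functions are hypothetical names made up for this example.

```python
import re
from collections import defaultdict

# Hypothetical crawled pages: URL -> page text.
PAGES = {
    "/": "Welcome to our garden shop",
    "/roses": "Red roses and white roses for your garden",
    "/tools": "Garden tools and gloves",
}


def words_in(text):
    """Split text into lowercase words, ignoring punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map every word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in words_in(text):
            index[word].add(url)
    return index


def lookup(index, query):
    """Return the URLs that contain every word in the query."""
    words = words_in(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


index = build_index(PAGES)
print(sorted(lookup(index, "garden")))
print(sorted(lookup(index, "garden roses")))
```

Because the words are looked up directly, the search never has to reread the pages themselves, which is what makes retrieval fast.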

The Ranking System or Algorithm
The final and trickiest part of the search engine is the system it uses to rank its listings. This is done by a complex series of formulae known as an "algorithm". This is the formula the search engine uses to rank webpages in order of relevance (as the engine judges it) to a specific search.

Each search engine has its own algorithm for indexing and ranking webpages, which explains why you can do the exact same search on two different search engines and get completely different results.
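To see how two algorithms can order the same pages differently, consider the toy sketch below. Real ranking algorithms are secret and far more elaborate, so both scoring rules here are assumptions chosen purely for illustration: one counts how often the search words appear on a page, the other divides that count by the page's length. The PAGES data and all function names are hypothetical.

```python
# Hypothetical pages: URL -> page text.
PAGES = {
    "/long": "roses roses roses garden tools gloves soil pots seeds hoses",
    "/short": "roses garden",
    "/other": "garden tools",
}


def tokenize(text):
    """Split text into lowercase words on whitespace."""
    return text.lower().split()


def score_by_count(query_words, doc_words):
    """Toy rule A: total number of times the query words appear."""
    return sum(doc_words.count(word) for word in query_words)


def score_by_density(query_words, doc_words):
    """Toy rule B: the same count, divided by the page length."""
    if not doc_words:
        return 0.0
    return score_by_count(query_words, doc_words) / len(doc_words)


def rank(query, pages, score):
    """Return matching URLs, best score first (ties broken by URL)."""
    query_words = tokenize(query)
    scored = []
    for url, text in pages.items():
        value = score(query_words, tokenize(text))
        if value > 0:  # pages with no matching word are left out
            scored.append((value, url))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [url for _, url in scored]


engine_a = rank("roses", PAGES, score_by_count)
engine_b = rank("roses", PAGES, score_by_density)
print("Engine A:", engine_a)
print("Engine B:", engine_b)
```

The first rule favours the long page, which mentions the word three times, while the second favours the short page, where the word makes up half the text. The same search over the same pages therefore yields two different orderings.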

Exactly how the algorithm works is always a closely guarded secret of the search engine concerned. Most of them tend to follow a similar pattern, although some are more complex and refined than others.

Search engines differ in how they function, but at the end of the day they all perform three fundamental tasks:

1. Searching the Internet based on the search query entered by the user.

2. Maintaining an index of all the information they gather from various pages.

3. Permitting users to search for words available in that index.