Search Engines and the Art of Linking

The anchor tag is fundamental to the web. People use it all the time, and the web would not be the web without it. In recent years, the rise of Google has increased the importance of anchor text considerably. Unfortunately, most people remain blithely unaware of this development. Poorly worded links and outright linking blunders are making it harder to find information.

Searching for 'Minesweeper' on Google returns approximately 195,000 documents. The page ranked number two in the results is a 404 page with no occurrence of Minesweeper in the contents of the page. The URL of that page used to point to Richard Kaye's Minesweeper is NP-Complete page, which has moved. Google still considers it highly relevant as many links point to it and contain the word Minesweeper in and around the anchor text. People should fix broken links but this is not the point of this example. The point here is the importance of the right keywords in the anchor text.
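
To make the effect concrete, here is a minimal sketch of a toy index built from anchor text alone. It is not Google's algorithm, only an illustration of how a page can rank for a word that never appears in its contents. The function names, the link data, and the example.org URLs are all hypothetical.

```python
from collections import defaultdict

def build_anchor_index(links):
    """Map each lowercased anchor-text word to a per-URL count of inbound links."""
    index = defaultdict(lambda: defaultdict(int))
    for anchor_text, target_url in links:
        # Count each word once per link, however often it repeats in the text.
        for word in set(anchor_text.lower().split()):
            index[word][target_url] += 1
    return index

def search(index, word):
    """Return the URLs linked with this word, most-linked first."""
    hits = index.get(word.lower(), {})
    return sorted(hits, key=lambda url: (-hits[url], url))

# Hypothetical inbound links as (anchor text, target URL) pairs.
# The old page may well be a 404 now; this index never looks at page contents.
links = [
    ("Minesweeper is NP-Complete", "http://example.org/old-page"),
    ("Richard Kaye's Minesweeper page", "http://example.org/old-page"),
    ("this article", "http://example.org/new-page"),
    ("Minesweeper proof", "http://example.org/new-page"),
]

index = build_anchor_index(links)
print(search(index, "Minesweeper"))
# ['http://example.org/old-page', 'http://example.org/new-page']
```

In this toy model the link labeled "this article" contributes nothing to a search for Minesweeper, so the stale page outranks the new one.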

People don't realize that by putting up links they are not just achieving the obvious but are also acting as navigators. The links they put up help guide searches for search engine users. Anchor texts such as "this article" and "this link" are far too common. Google claims to take into account the words near a link, but those words are typically much less relevant than the anchor text itself. Consequently, it is logical to assume that Google puts much less emphasis on neighboring words than on anchor text.

The popular news site Slashdot has a tech-savvy audience. News stories are generally submitted by the readers themselves. An anchor text analysis of the December 20, 2003 edition of Slashdot (arbitrarily chosen) reveals more than 50 percent of the anchor texts to be non-descriptive. This is worse than it sounds, because the posted stories are selected by editors and represent the best submissions.

There is a very clear trend to be noticed on Slashdot. The best anchor texts are descriptive and fairly long, at five or more words. The worst ones consist of a single word. Using "recently" as anchor text gets the job done, but people typically don't search for "recently" on Google, and even if they did, they would be unlikely to find what they were looking for.
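
An anchor text analysis of this kind can be roughly automated. The sketch below is not the method used for the Slashdot figures above. It assumes a simple heuristic: an anchor text counts as non-descriptive if it is a single word or matches a short list of generic phrases. The phrase list, the class and function names, and the sample HTML are illustrative assumptions.

```python
from html.parser import HTMLParser

# Assumed list of generic phrases; extend it to suit the pages being checked.
GENERIC_PHRASES = {"here", "click here", "this", "this article", "this link", "link"}

class AnchorTextCollector(HTMLParser):
    """Collect the visible text of every <a href="..."> element."""

    def __init__(self):
        super().__init__()
        self.anchor_texts = []
        self._parts = None  # None while outside an anchor element

    def handle_starttag(self, tag, attrs):
        if tag == "a" and dict(attrs).get("href"):
            self._parts = []

    def handle_data(self, data):
        if self._parts is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._parts is not None:
            # Collapse runs of whitespace into single spaces.
            self.anchor_texts.append(" ".join("".join(self._parts).split()))
            self._parts = None

def is_non_descriptive(text):
    """Heuristic: empty or single-word anchors and generic phrases say little."""
    words = text.lower().split()
    return len(words) <= 1 or " ".join(words) in GENERIC_PHRASES

def non_descriptive_share(html):
    """Return the fraction of anchor texts in the HTML that are non-descriptive."""
    collector = AnchorTextCollector()
    collector.feed(html)
    collector.close()
    texts = collector.anchor_texts
    if not texts:
        return 0.0
    return sum(is_non_descriptive(t) for t in texts) / len(texts)

sample = (
    '<p>A study was <a href="/a">recently</a> published; see '
    '<a href="/b">this article</a> or '
    '<a href="/c">Richard Kaye\'s Minesweeper is NP-Complete page</a>.</p>'
)
print(round(non_descriptive_share(sample), 2))  # 0.67
```

Two of the three sample links fail the heuristic. Only the third tells a reader, or a search engine, where it leads.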

Apart from increased search engine exposure, anchor text provides the motivation for visiting a link. Visiting links entails a time expense, as webpages take time to load. Moreover, visiting a link disrupts the focus of the reader. A tiny, hardly visible, one word link is often insufficient justification for incurring these costs.

Having the right keywords in the anchor text is not enough. The goal of anchor text should be to identify a particular page uniquely. Anchor text such as "Richard Kaye's Minesweeper is NP-Complete Page" is descriptive enough to locate the page if the link to it grows stale. The new location of the page or a cached/archived copy can be quickly found using a search engine. Poor choice of anchor text does not allow any such possibility.

Non-mainstream websites require special consideration. Such sites typically do not benefit much from linking to individual articles. The inbound links get divided up amongst different pages of the website with no significant improvement to the site's rank. The best practice in such cases is not just to link to the article in question but also to link to the parent website with a descriptive anchor text.

Before adding a link it is important to understand that linking always equals promoting. In all likelihood, additional links will bring additional traffic to the linked webpage. Individuals weaned on web-forums and Usenet do not appreciate this fact. Many are fond of writing angry refutations to articles they don't like and then linking to the article being refuted with juicy anchor text. This is free promotion for the original article (the article being refuted). The original article gets an extra link and consequently an improved rank in Google searches.

An intelligent response is a must when refuting information on the web. The tone of any refutation needs to be civilized and the arguments convincing. The author of the original article must be informed of the existence of the refutation. If the argument is good enough, many authors will add reciprocal links. In case no reciprocal links are added, it is best to use non-descriptive anchor text to link to the original article.

Link bombing is amusing at times, but some people are fond of using it to shoot themselves in the foot. Link bombing works by creating many links to a victim site with some irrelevant target keywords in the anchor text. When a user searches on Google for the target keywords, the victim site comes up. Link bombing is a bad practice as it makes searches less efficient. It can be a good promotion for a friend's site, but some people are using it to promote their foes. Currently, an ongoing link bombing attack is causing searches for "miserable failure" to return George W. Bush's biography.

Sometimes the intention of a link is to guide in a general direction and not point to any specific website. Pointing people to websites offering free web hosting is one such scenario. Sites offering free web hosting frequently change policies, merge with other providers, or simply disappear. Any links pointing to them are likely to lose their context rather quickly. The best compromise in such situations is to link to a web search. The web search will in all likelihood come up with up-to-date results and will keep pace with changes in the marketplace.
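
A link to a web search is just a URL with the query encoded in it. The sketch below builds one with Python's standard library. It assumes the search engine accepts the query in a `q` parameter, as Google's search page does; the helper names and the anchor text are illustrative.

```python
from html import escape
from urllib.parse import urlencode

def search_url(query, base_url="https://www.google.com/search"):
    """Build a search URL; assumes the engine reads the query from 'q'."""
    return f"{base_url}?{urlencode({'q': query})}"

def search_anchor(query, anchor_text):
    """Wrap the search URL in an HTML anchor with descriptive text."""
    return f'<a href="{escape(search_url(query), quote=True)}">{escape(anchor_text)}</a>'

print(search_url("free web hosting"))
# https://www.google.com/search?q=free+web+hosting
print(search_anchor("free web hosting", "sites offering free web hosting"))
# <a href="https://www.google.com/search?q=free+web+hosting">sites offering free web hosting</a>
```

The anchor text still matters here: it should describe what the search is for, not read "click here".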

Linking is not a science; it is a skill and an art. People get better only with practice and experience. The best way to practice is by creating a personal website. More personal websites will mean more links and better searches. Having five links to an article instead of one does not create any additional content. However, as different people catalog things differently (that is, use different anchor texts), it makes finding information easier.

People need to take responsibility for making the web work well. Instead of complaining about poor search engine results, people need to work proactively to enhance search engine relevancy. A search engine is always going to use heuristics to deliver what is out there. If there is too much noise and too little guidance, no search engine is going to be able to filter through it.

by Usman Latif  [Dec 25, 2003]


