PageRank
by Illya J. D'Addezio
http://www.webmasteroutpost.com/articles/pagerank_primer.html


Before you get carried away and place a down payment on that mansion on the hill you've always eyed, let me state as clearly as possible: PAGERANK IS NOT THE HOLY GRAIL! And yet it can be used by search engines as a weapon to inflict damage on legitimate businesses. Recently, one of the earliest search engine providers, SearchKing, Inc., initiated legal proceedings against Google for its abuse of PageRank (see PRESS RELEASE). Because of the high value associated with PageRank, Bob Massa, the owner of SearchKing, Inc., claims in his lawsuit that the purposeful reduction of the rankings of SearchKing and its related web sites has damaged the company's reputation and diminished its value.

Just what is this PageRank thing that causes us so much grief? PageRank ("PR") is a numeric value that represents how important a page is on the web. Google figures that when one page links to another page, it is effectively casting a vote for the other page. The more votes that are cast for a page, the more important the page must be. The PR of a page, which ranges from zero to 10 (10 being the highest), is displayed publicly for each page you visit by the Google toolbar, which can be downloaded to any computer for free. PageRank is one of the methods Google uses to determine a page's relevance or importance -- it is only one part of the story when it comes to the Google listing.

Sergey Brin and Lawrence Page, Google's founders, developed the PageRank concept and define it as follows:

We assume page A has pages T1...Tn which point to it (i.e., are citations). The parameter d is a damping factor which can be set between 0 and 1. We usually set d to 0.85. There are more details about d in the next section. Also C(A) is defined as the number of links going out of page A. The PageRank of a page A is given as follows:

PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))

Note that the PageRanks form a probability distribution over web pages, so the sum of all web pages' PageRanks will be one.

PageRank or PR(A) can be calculated using a simple iterative algorithm, and corresponds to the principal eigenvector of the normalized link matrix of the web.
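To make the "simple iterative algorithm" a little more concrete, here is a minimal sketch in Python that applies the formula above over and over until the numbers settle. It is only an illustration, not how Google actually does it: the function name pagerank, the starting value of 1.0, the fixed count of 100 passes and the three-page example web are all my own assumptions.

```python
def pagerank(links, d=0.85, iterations=100):
    """Iterate PR(A) = (1-d) + d * sum(PR(T)/C(T)) over every page.

    links maps each page name to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    pr = {page: 1.0 for page in pages}  # assumed starting guess
    for _ in range(iterations):
        new_pr = {}
        for page in pages:
            # Each page T that links here passes along PR(T) / C(T).
            votes = sum(pr[src] / len(targets)
                        for src, targets in links.items()
                        if page in targets)
            new_pr[page] = (1 - d) + d * votes
        pr = new_pr
    return pr


# Hypothetical three-page web: A links to B and C, B links to C, C links to A.
example = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(example)
for page in sorted(ranks):
    print(page, round(ranks[page], 3))
```

In this toy web, C ends up with the highest value because it collects votes from both A and B, while B, which only gets half of A's vote, ends up lowest.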

There are some helpful links at the end of this article if you want to understand in extreme detail how to determine (well, at least, guesstimate) PageRank. Only Google knows how it is really done. And only Google has the power to truly control its behavior.

While it's not the "holy grail" and you really cannot control it, understanding the basics of PageRank is important so you don't "give it away" or "waste it" carelessly. If you read the articles below you'll see that different internal site linking schemes will yield different PageRank values. As PageRank values are calculated, a given page, Page A, gives its own PR away to the pages it links to, and in return receives PR from the pages that link back to it (hence the term "backlink"). While you can only try to influence how and where other webmasters link to your site, you are in total control of how you link to yourself and to other sites.

Your home page is probably the page most other sites will link to, and the one you've probably done the most search engine optimization on. So make sure all of your own pages link to it. If you have a lot of pages, you should probably set up a layer of pages in between and have subordinate pages link to them as well. Here's an example.

Home Page - A
  1st Level - B
    2nd Level - C
    2nd Level - D
 

Or, if this makes more sense...

/index.html - A
  /folder/index.html - B
    /folder/page1.html - C
    /folder/page2.html - D
 

Make sure B, C and D link to A, and C and D link to B. Optimize A for your general keywords, and each B for a specific related keyword. Use the keywords as the text in the links. Also, link A back to B, and B to C and D, to keep the PR circulating internally. If, for example, D also had an external link (i.e. a link to another web site), you should also link it to C (and any other pages in the folder) to minimize the "PageRank leakage" to the external site.
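If you want to see the "leakage" effect in numbers, the rough sketch below runs the published formula over the A/B/C/D layout described above, with one external link on page D. It compares D without and with the extra link to C. The pagerank helper, the external page named X and the assumption that X never links back are mine, and real-world values will certainly differ.

```python
def pagerank(links, d=0.85, iterations=100):
    """Repeatedly apply PR(A) = (1-d) + d * sum(PR(T)/C(T))."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pr = {page: 1.0 for page in pages}
    for _ in range(iterations):
        pr = {page: (1 - d) + d * sum(pr[src] / len(targets)
                                      for src, targets in links.items()
                                      if page in targets)
              for page in pages}
    return pr


SITE_PAGES = ["A", "B", "C", "D"]

# D links up to A and B, plus the external site X (hypothetical).
isolated = {
    "A": ["B"],
    "B": ["A", "C", "D"],
    "C": ["A", "B"],
    "D": ["A", "B", "X"],
}

# Same site, but D also links across to C, so X gets a smaller share.
diffused = dict(isolated)
diffused["D"] = ["A", "B", "C", "X"]


def site_total(links):
    """Sum the PR that stays on our own four pages."""
    pr = pagerank(links)
    return sum(pr[page] for page in SITE_PAGES)


isolated_total = site_total(isolated)
diffused_total = site_total(diffused)
print("D links to A, B, X:   ", round(isolated_total, 2))
print("D links to A, B, C, X:", round(diffused_total, 2))
```

Under these assumptions the second total comes out higher: X receives one quarter of D's vote instead of one third, so more PR stays inside the site.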

It's a good idea to keep the number of links to pages with external links to a minimum, and on pages with external links to maximize the number of internal links. I often use the analogy of a hamster Habitrail -- those cages with the extra tubes you can connect in various configurations so the hamsters can run around and around. The hamster is the PageRank and you want to keep it in the Habitrail -- so don't point all of the tubes towards the exit holes... Make sure the optimized pages on your site are well linked to, and reduce links to the other pages. This gives the pages you hope will fare well on the search engines an extra boost.

PageRank is calculated iteratively and starts at 1.00 on each page. On a site with no external links, the total PageRank for the site is the sum of its pages' starting values, which is simply the number of pages. Therefore, if you want to have more PageRank to work with, develop more content-rich pages and link them together effectively. After you carefully map out all of your pages, designate those with the fewest inbound links as possible pages for external links. Keep external links off of your higher-PR pages.
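A quick way to convince yourself of the "total equals the number of pages" idea is to run the published formula on a closed site and add up the results. The sketch below is just that, a sketch: the pagerank helper and both example sites are made up for illustration, and it assumes every page has at least one outgoing link and nothing links in or out of the site.

```python
def pagerank(links, d=0.85, iterations=100):
    """Repeatedly apply PR(A) = (1-d) + d * sum(PR(T)/C(T))."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pr = {page: 1.0 for page in pages}  # every page starts at 1.00
    for _ in range(iterations):
        pr = {page: (1 - d) + d * sum(pr[src] / len(targets)
                                      for src, targets in links.items()
                                      if page in targets)
              for page in pages}
    return pr


# A closed four-page site (hypothetical): no links leave it.
site = {
    "A": ["B"],
    "B": ["A", "C", "D"],
    "C": ["A", "B"],
    "D": ["A", "B"],
}

# The same site after adding two more pages, E and F, under B.
bigger_site = dict(site)
bigger_site["B"] = ["A", "C", "D", "E", "F"]
bigger_site["E"] = ["A", "B"]
bigger_site["F"] = ["A", "B"]

total = sum(pagerank(site).values())
bigger_total = sum(pagerank(bigger_site).values())
print(round(total, 3), round(bigger_total, 3))
```

The individual pages end up with very different values, but the totals stay at 4 and 6: the links only decide how the site's PR is shared out, while the number of pages decides how much there is to share.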

The popular "reciprocal links" page is a PageRank nightmare! And worst of all, it is always linked from the PR-rich home page. Having a links page may be unavoidable; just don't make the mistake of linking to it all over the place. Try to diffuse the loss by including lots of internal links, and consider having an intermediate page with the internal links (it could be your site map).

/index.html - A
  /links_in.html - B
    /links_out.html - C
 

Page A links to B and B to C. Page C is going to link to dozens of external pages, so it's got major leakage. By putting page B in the middle and having it link to other pages on your site, you're reducing the amount of PR that makes its way to C -- therefore keeping the PR to yourself. Since the position of a link on a page makes no difference, you can keep webmasters happy by making the link to C quite prominent on page B.
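Here is a rough numerical check of the intermediate-page idea, again using the published formula. Everything specific in it is an assumption for illustration: the pagerank helper, two content pages P1 and P2, ten external sites X0 to X9 that never link back, and a links page C that links home as well as out.

```python
def pagerank(links, d=0.85, iterations=100):
    """Repeatedly apply PR(A) = (1-d) + d * sum(PR(T)/C(T))."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pr = {page: 1.0 for page in pages}
    for _ in range(iterations):
        pr = {page: (1 - d) + d * sum(pr[src] / len(targets)
                                      for src, targets in links.items()
                                      if page in targets)
              for page in pages}
    return pr


externals = [f"X{i}" for i in range(10)]  # hypothetical outside sites

# Home page A links straight to the outbound links page C.
direct = {
    "A": ["P1", "P2", "C"],
    "P1": ["A"],
    "P2": ["A"],
    "C": ["A"] + externals,
}

# Home page A links to B instead; B links to internal pages and to C.
buffered = {
    "A": ["P1", "P2", "B"],
    "P1": ["A"],
    "P2": ["A"],
    "B": ["A", "P1", "P2", "C"],
    "C": ["A"] + externals,
}

direct_pr = pagerank(direct)
buffered_pr = pagerank(buffered)
print("PR of C, linked from A:", round(direct_pr["C"], 2))
print("PR of C, linked from B:", round(buffered_pr["C"], 2))
print("PR of A, direct vs buffered:",
      round(direct_pr["A"], 2), round(buffered_pr["A"], 2))
```

In this made-up layout, C holds less PR when it sits behind B, so there is less to leak to the ten external sites, and the home page A ends up with more.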

You really can't cheat the PageRank system, but if you're not careful you can be defeated by it. Worry about the internal structure of your site, develop as much relevant content as you can, and then just invite other webmasters to exchange reciprocal links with you. DO NOT participate in content-less link farms, as you'll only end up losing PR. If someone offers to sell you some of their PR, research their page carefully. How many external links do they have? How are they channelling PR internally? And in the end, no one can guarantee PR, so be prepared: their PR could disappear just as quickly as your money did!


Illya D'Addezio has operated the Webmaster Outpost since 1999 and his marketing strategy has helped countless webmasters develop successful web businesses. © 2002 Software Wonders of NJ. Permission to reprint this article is freely granted provided that this author bio paragraph and copyright is included without modification.