Earlier today, Dixon Jones from Majestic shared on Twitter[1] a thorough, digestible explanation of how PageRank actually works.
I gave it a watch myself, and thought it was a good moment to revisit this wild piece of math that has made quite a dent on the world over the past 20 years.
As a sidenote, we know as of 2017[2] that while PageRank was removed from the Toolbar in 2016, it still forms an important part of the overall ranking algorithm, and thus is worthwhile to understand.
Jones starts with the simple[3] — or at least, straightforward — formula.
For those who don’t adore math, or who may have forgotten a few technical terms since the last calculus class, this formula would be read aloud like this:
“The PageRank of a page in this iteration equals 1 minus a damping factor, plus, for every link into the page (except for links to itself), add the page rank of that page divided by the number of outbound links on the page and reduced by the damping factor.”
Back to the original Google paper
At this point, Jones moves forward in the video to a simpler, still useful version of the calculation. He pulls out excel, an easy 5 node visual, and maps out the ranking algorithm over 15 iterations. Great stuff.
Personally, I wanted a bit more of the math, so I went back and read the full-length version of “The Anatomy of a Large-Scale Hypertextual Web Search Engine[4]” (a natural first step). This was the paper written by Larry Page and Sergey Brin in 1997. Aka the paper in which they presented Google, published in the Stanford Computer Science Department. (Yes, it