Happy birthday Google


Brin and Page:

We assume page A has pages T1, . . ., Tn which point to it (i.e., are citations). The parameter d is a damping factor which can be set between 0 and 1. We usually set d to 0.85, . . ., C(A) is defined as the number of links going out of page A. The PageRank of a page A is given as follows:

PR(A) = (1-d) + d (PR(T1)/C(T1) + … + PR(Tn)/C(Tn))
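To make the formula concrete, here is a minimal sketch that applies it repeatedly to a tiny made-up web until the values settle. The function name `pagerank`, the `out_links` dictionary, the fixed iteration count and the three-page example are all assumptions chosen for illustration; this is not how Google computes the scores at scale.

```python
def pagerank(out_links, d=0.85, iterations=50):
    """Repeatedly apply PR(A) = (1-d) + d * (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)).

    out_links maps each page to the list of pages it links to (hypothetical input format).
    """
    pages = list(out_links)
    pr = {page: 1.0 for page in pages}  # arbitrary starting guess
    for _ in range(iterations):
        new_pr = {}
        for page in pages:
            # Sum PR(T)/C(T) over every page T that links to `page`;
            # C(T) is the number of links going out of T.
            incoming = sum(
                pr[t] / len(out_links[t])
                for t in pages
                if page in out_links[t]
            )
            new_pr[page] = (1 - d) + d * incoming
        pr = new_pr
    return pr


# Hypothetical three-page web: A links to B and C, B links to C, C links to A.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(toy_web)
```

In this toy web, C ends up ranked above B: both are cited by A, but C also receives B's only outgoing link. With this form of the formula, and every page having at least one outgoing link, the scores add up to the number of pages.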

Google’s search algorithm refines degree centrality, which we touched upon in the last section when discussing testimony, trust and authority, into eigenvector centrality. Eigenvector centrality rests on the idea that not all connections should be weighted equally: a link from a well-connected page counts for more than a link from an obscure one. Google’s PageRank (PR) is a star example of eigenvector centrality and a direct descendant of the citation systems used in traditional library science, the most familiar being journal rankings. The more citations other documents make to a particular document, the more “important” that document is, and, through aggregation techniques, the more status is accorded to the journal in which it appears. In much the same way, Google’s PR algorithm assesses the importance or relevance of a Web page. Search engines are, as Christophe Heintz puts it, “reputation systems” (EigenTrust algorithms) in that they ostensibly promote epistemic and cognitive worth (Heintz, 2006). PR’s power lies in its ability to solve an equation with over 500 million variables and 2 billion terms. Its simplicity lies in assessing a page’s importance by counting backlinks, much as library science has traditionally counted citations as an objective measure.
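The difference between counting links and weighting them can be seen in a small sketch. Below, pages X and Y each have exactly one backlink, so degree centrality cannot tell them apart, yet the PageRank formula quoted above separates them because X is cited by a heavily cited page. The seven-page web and the names `in_degree`, `pagerank`, `hub` and `leaf` are invented for illustration only.

```python
def in_degree(out_links):
    """Degree centrality as a plain backlink count."""
    counts = {page: 0 for page in out_links}
    for targets in out_links.values():
        for target in targets:
            counts[target] += 1
    return counts


def pagerank(out_links, d=0.85, iterations=100):
    """Iterate PR(A) = (1-d) + d * sum of PR(T)/C(T) over pages T linking to A."""
    pr = {page: 1.0 for page in out_links}
    for _ in range(iterations):
        pr = {
            page: (1 - d) + d * sum(
                pr[t] / len(targets)
                for t, targets in out_links.items()
                if page in targets
            )
            for page in out_links
        }
    return pr


# Hypothetical web: three small pages cite "hub"; "hub" cites X; "leaf" cites Y.
toy_web = {
    "s1": ["hub"], "s2": ["hub"], "s3": ["hub"],
    "leaf": ["Y"],
    "hub": ["X"],
    "X": ["hub"], "Y": ["hub"],
}
counts = in_degree(toy_web)
ranks = pagerank(toy_web)
```

Here `counts["X"]` and `counts["Y"]` are both 1, but `ranks["X"]` comes out far higher than `ranks["Y"]`, because the single page citing X is itself cited by five others while the page citing Y is cited by none.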

Brin, S. & Page, L. (1997). The anatomy of a large-scale hypertextual web search engine.

Heintz, C. (2006). Web search engines and distributed assessment systems. In S. Harnad & I. Dror (Eds.), Distributed cognition: Special issue of Pragmatics & Cognition, 14(2), 287–409.
