Re: Looking for feedback...
First off, where are you seeing 30 million links? Did I miss something?
Next: until Google tells us otherwise, I can guarantee that PageRank is evaluated based on links from one site to another.
PageRank Explained
PageRank relies on the uniquely democratic nature of the web by using its vast link structure as an indicator of an individual page's value. In essence, Google interprets a link from page A to page B as a vote, by page A, for page B. But, Google looks at more than the sheer volume of votes, or links a page receives; it also analyzes the page that casts the vote. Votes cast by pages that are themselves "important" weigh more heavily and help to make other pages "important."
Important, high-quality sites receive a higher PageRank, which Google remembers each time it conducts a search. Of course, important pages mean nothing to you if they don't match your query. So, Google combines PageRank with sophisticated text-matching techniques to find pages that are both important and relevant to your search. Google goes far beyond the number of times a term appears on a page and examines all aspects of the page's content (and the content of the pages linking to it) to determine if it's a good match for your query.
Source: http://www.google.com/technology/
IMO Google looks at a link in two ways: as an anchor text vote and as a PageRank contribution. The anchor text vote is a simple semantic comparison of the text of the linking site, the text of the linked site, and the anchor that connects them. The closer they are related, the stronger the influence (the more anchor points) awarded, which helps in the SERPs. This is why we always promote "relevant" over "irrelevant" link exchange. I also believe that if too many completely unrelated relationships are formed (especially reciprocally), a negative will be awarded to the site.
Now as for PR, to my understanding it is still based on the original PR = (1-d) + d(PRt1/Ct1), where d is a number we don't know called the damping factor (for the calculation it doesn't matter), PRt1 is the PR of the page linking out (which Google knows but we don't), and Ct1 is the number of links on the page linking out.
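For anyone who wants to plug numbers in, here is a rough Python sketch of that formula. The function name and the (pr, c) pair layout are my own labels, not anything Google publishes. The original paper sums the PR/C term over every page that links in, so the sketch accepts a list of linking pages; with a single entry it is exactly the formula as written above. The default d = 0.85 is only a commonly quoted placeholder.

```python
def pagerank_from_inlinks(inlinks, d=0.85):
    """Evaluate (1 - d) + d * sum(PR_t / C_t) over the linking pages.

    inlinks: list of (pr, c) pairs, one per page linking in, where pr is
             that page's PR and c is the number of links on it.
    d:       damping factor; 0.85 is a placeholder, the real value is unknown.
    """
    total = 0.0
    for pr, c in inlinks:
        if c < 1:
            raise ValueError("a linking page needs at least one link")
        total += pr / c
    return (1 - d) + d * total


# One linking page with PR 1 and 2 links on it, d = 0.5
print(pagerank_from_inlinks([(1.0, 2)], d=0.5))  # 0.75
```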
The reason d doesn't matter is as follows:
PR of an individual page in a system = 1
PR must diminish if 2 pages are linked together, otherwise it would increase forever with each iteration. Therefore the damping factor must be some number less than 1 for the calculation to work.
Here is the math
If a page has no outbound links, PRt1 and Ct1 both = 1, so you can use any number as d (as long as it is below one) and the formula will = 1.
Observe:
d = .85 in (1-d) + d(PRt1/Ct1)
(1-.85) + .85(1/1) = 1
and if d = .5 in (1-d) + d(PRt1/Ct1)
(1-.5) + .5(1/1) also = 1, and so on and so on.
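If you'd rather check that than take my word for it, here is a quick Python sketch that runs the same formula with PRt1 = 1 and Ct1 = 1 for a handful of d values. The function name and the list of d values are made up for illustration. Floating point may show a result a hair off 1, so the comparison uses a small tolerance.

```python
def single_link_formula(d, pr_t1=1.0, c_t1=1):
    """(1 - d) + d * (PRt1 / Ct1) for one linking page."""
    return (1 - d) + d * (pr_t1 / c_t1)


# Arbitrary sample values for d, all below 1
for d in (0.15, 0.5, 0.85, 0.99):
    result = single_link_formula(d)
    # Allow for floating-point rounding when comparing to 1
    print(d, result, abs(result - 1.0) < 1e-12)
```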
Now what happens when you start adding links? Let's increase the outbound links to 1, so PRt1 = 1 and Ct1 = 2 (Ct1 = 2 because I count the page itself as the first link, and each subsequent outbound link adds one more). We will keep .5 as d.
(1-.5)=.5
.5(1/2) = .25
add them together = .75
Thus a page with a weight of 1 (all pages start with a weight of one) can pass on .75, or 75% of its weight, to the page it links to. Assuming that page was not linked to from anywhere else, it also started with a weight of 1, so it now has 1.75 weight points (PageRank points). Now imagine it links out to just one other previously unlinked page (weight of 1). It can pass 75% of its weight to that page (assuming there is only one link), which is .75(1.75) = 1.3125, so the 3rd page in the system is now at a weight of 2.3125.
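Here is the same chain as a short Python sketch, so you can extend it past three pages. It follows my simplified model from this post (every page starts at 1, keeps its own weight, and passes on a fixed fraction), not Google's actual implementation, and the function and parameter names are made up.

```python
def chain_weights(n_pages, d=0.5, c=2, start=1.0):
    """Weights along a straight chain of pages: 1 -> 2 -> 3 -> ...

    Each page starts at `start` and receives a fraction of the previous
    page's weight. The fraction is (1 - d) + d * (1 / c), which is .75
    for d = .5 and c = 2, as in the worked numbers above.
    """
    pass_fraction = (1 - d) + d * (1 / c)
    weights = [start]
    for _ in range(n_pages - 1):
        weights.append(start + pass_fraction * weights[-1])
    return weights


print(chain_weights(3))  # [1.0, 1.75, 2.3125]
```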
This is how PageRank grows in a linear system.
Now the reason for the damping factor:
What if you go back to your first 2 pages and link page A to page B?
We already know page A has a weight of 1 and can contribute .75 to page B, so with the one link page B becomes 1.75. But what happens if page B links back to page A? All of a sudden page A is receiving 75% of 1.75 back and becomes 2.3125. But wait, isn't page A linking to page B? So it's now sending 75% of 2.3125 to page B, which has 1.75, so again it grows. This is all due to the fact that PageRank allows pages to pass a % of their weight to another page via a link while still keeping all the PageRank they have. So if there were no damping factor, the numbers would just grow and grow into infinity.
With our d = .5, this means that the amount of PageRank in the system that can pass is diminished by 50% each time (with the exception of the initial one point, which becomes obsolete in an iteration loop). Thus, for those of you familiar with limits in calculus, as the number of iterations approaches infinity, x approaches 0 (x being the amount of PageRank that a page can pass).
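To watch the damping do its job, here is a small Python sketch of two pages that link only to each other, using the (1-d) + d(PR/C) formula directly. A few assumptions of mine: each page has exactly one outbound link, so I use C = 1 here (counting only the outbound link, which is not the Ct1 = 2 convention from my worked example), both pages are updated at the same time each round, and the names are made up. How Google actually schedules its iterations isn't public.

```python
def iterate_two_page_loop(pr_a, pr_b, d, iterations):
    """Repeatedly apply PR = (1 - d) + d * (PR_other / 1) to pages A and B.

    Returns the list of (pr_a, pr_b) pairs, starting values included.
    """
    history = [(pr_a, pr_b)]
    for _ in range(iterations):
        # Both pages are recomputed from the previous round's values
        pr_a, pr_b = (1 - d) + d * pr_b, (1 - d) + d * pr_a
        history.append((pr_a, pr_b))
    return history


for pair in iterate_two_page_loop(1.0, 1.75, d=0.5, iterations=3):
    print(pair)
# (1.0, 1.75)
# (1.375, 1.0)
# (1.0, 1.1875)
# (1.09375, 1.0)
```

The extra .75 above the base weight of 1 is halved every round (.75, .375, .1875, ...), so both pages settle toward 1 instead of growing.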
The number of iterations this takes depends on what you set d to, and d is one of the factors that I believe Google plays with when adjusting the importance of PageRank. (Want to make PageRank important in the algo? Set d to a high number. Want to make it less important? Set d to a low number.)
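As a rough illustration of how d drives the iteration count, this Python sketch counts how many rounds it takes for the passable extra weight to shrink below a cutoff, assuming it is multiplied by d every round. The starting extra of .75, the cutoff of 1e-6, and the function name are arbitrary choices of mine, not anything Google uses.

```python
def rounds_until_negligible(d, start_extra=0.75, tolerance=1e-6):
    """Count rounds until start_extra * d**n falls below tolerance."""
    if not 0 <= d < 1:
        raise ValueError("d must be at least 0 and below 1")
    extra = start_extra
    rounds = 0
    while extra >= tolerance:
        extra *= d  # the passable weight shrinks by a factor of d each round
        rounds += 1
    return rounds


print(rounds_until_negligible(0.5))  # 20
for d in (0.3, 0.85, 0.95):
    print(d, rounds_until_negligible(d))
```

A higher d means the weight keeps circulating for more rounds before it dies out.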
What does all this mathematical rambling, which might only make sense to me, mean? It means that I think PageRank is 100% dependent on links.
The End
Morgan