This has been making the rounds today at other forums and has yet to make it here. So here it is...
Information retrieval based on historical data
I spent the day reading it and found it most interesting.
I don't believe I've read that one yet. Thanks for the heads up.
The patent was filed in December 2003 but was only released today.
So you are not too far behind on your reading yet.
Awesome article...thanks!
I'm sure it is a great article. As I began reading it, then skimming, then just skipping around and reading bits and pieces....
...I figured I should simply wait until it was translated from engineerese into English or Arabic or some other easily understood language (humor).
I think we have to recognize that it's not an article written to be understood.
It is a Patent Application written by an experienced Intellectual Property Attorney (or several attys) meant to obfuscate.
A lot of talk about dates. The Anchor Text section below that is worth a read too.
[0077] The dates that links appear can also be used to detect "spam," where owners of documents or their colleagues create links to their own document for the purpose of boosting the score assigned by a search engine. A typical, "legitimate" document attracts back links slowly. A large spike in the quantity of back links may signal a topical phenomenon (e.g., the CDC web site may develop many links quickly after an outbreak, such as SARS), or signal attempts to spam a search engine (to obtain a higher ranking and, thus, better placement in search results) by exchanging links, purchasing links, or gaining links from documents without editorial discretion on making links. Examples of documents that give links without editorial discretion include guest books, referrer logs, and "free for all" pages that let anyone add a link to a document.
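To make the back-link spike idea in [0077] a bit more concrete, here is a minimal Python sketch. None of it comes from the patent: the weekly buckets, the `factor` and `min_links` thresholds, and the function names are assumptions made up for illustration. It only flags a spike; as the excerpt says, a spike could just as easily be a topical event as spam.

```python
from datetime import date, timedelta
from statistics import median

def weekly_counts(link_dates, start, weeks):
    """Count new back links in each 7-day bucket starting at `start`."""
    counts = [0] * weeks
    for d in link_dates:
        idx = (d - start).days // 7
        if 0 <= idx < weeks:
            counts[idx] += 1
    return counts

def spike_weeks(counts, factor=5.0, min_links=10):
    """Return indexes of weeks whose count is well above the median of
    all earlier weeks. Both thresholds are hypothetical."""
    flagged = []
    for i in range(1, len(counts)):
        baseline = max(median(counts[:i]), 1)
        if counts[i] >= min_links and counts[i] > factor * baseline:
            flagged.append(i)
    return flagged

start = date(2004, 1, 5)
# Two new links a week for five weeks, then forty links in the sixth week.
dates = [start + timedelta(weeks=w, days=k) for w in range(5) for k in range(2)]
dates += [start + timedelta(weeks=5, days=k % 7) for k in range(40)]

counts = weekly_counts(dates, start, 6)
print(counts)               # [2, 2, 2, 2, 2, 40]
print(spike_weeks(counts))  # [5]
```

A slow, steady series such as `[2, 3, 2, 4, 3]` flags nothing, which matches the "legitimate documents attract back links slowly" remark.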
[0078] According to a further implementation, the analysis may depend on the date that links disappear. The disappearance of many links can mean that the document to which these links point is stale (e.g., no longer being updated or has been superseded by another document). For example, search engine 125 may monitor the date at which one or more links to a document disappear, the number of links that disappear in a given window of time, or some other time-varying decrease in the number of links (or links/updates to the documents containing such links) to a document to identify documents that may be considered stale. Once a document has been determined to be stale, the links contained in that document may be discounted or ignored by search engine 125 when determining scores for documents pointed to by the links.
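The staleness test in [0078] could look roughly like the sketch below. The 90-day window, the 50% threshold, and the 0.2 discount are invented numbers, and `is_stale` and `link_weight` are hypothetical names; the patent only says that disappearing links "may" be monitored and that links from stale documents "may be discounted or ignored".

```python
from datetime import date, timedelta

def is_stale(disappeared, total_links, today, window_days=90, threshold=0.5):
    """True if more than `threshold` of a document's known back links
    vanished within the last `window_days` days (assumed rule)."""
    if total_links == 0:
        return False
    cutoff = today - timedelta(days=window_days)
    lost = sum(1 for d in disappeared if cutoff <= d <= today)
    return lost / total_links > threshold

def link_weight(source_is_stale, discount=0.2):
    """Weight given to a link, reduced when the page carrying it is stale."""
    return discount if source_is_stale else 1.0

today = date(2005, 3, 31)
# Six of ten back links vanished, one every ten days over the last two months.
gone = [today - timedelta(days=10 * k) for k in range(1, 7)]

print(is_stale(gone, 10, today))       # True  (6/10 lost)
print(is_stale(gone[:2], 10, today))   # False (2/10 lost)
print(link_weight(is_stale(gone, 10, today)))  # 0.2
```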
[0079] According to another implementation, the analysis may depend, not only on the age of the links to a document, but also on the dynamic-ness of the links. As such, search engine 125 may weight documents that have a different featured link each day, despite having a very fresh link, differently (e.g., lower) than documents that are consistently updated and consistently link to a given target document. In one exemplary implementation, search engine 125 may generate a score for a document based on the scores of the documents with links to the document for all versions of the documents within a window of time. Another version of this may factor a discount/decay into the integration based on the major update times of the document.
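One way to read [0079] is as an average over daily snapshots of the linking page: a page that links to the target in every version passes on its full score, while a "featured link of the day" page passes on very little. The sketch below assumes that reading. The snapshot tuples, the function name, and the half-life decay are illustrative assumptions; in particular, the decay here is by snapshot age, which is a simplification of the patent's "major update times" wording.

```python
def windowed_link_score(snapshots, half_life_days=None):
    """Average the source page's score over all snapshots in the window,
    counting only snapshots where it actually linked to the target.

    snapshots: list of (age_days, source_score, links_to_target).
    half_life_days: optional decay so that older snapshots count less.
    """
    num = 0.0
    den = 0.0
    for age, score, links in snapshots:
        weight = 1.0 if half_life_days is None else 0.5 ** (age / half_life_days)
        den += weight
        if links:
            num += weight * score
    return num / den if den else 0.0

# A page that has linked to the target every day for 30 days.
steady = [(age, 1.0, True) for age in range(30)]
# A page with a rotating featured link: it links to the target only today.
rotating = [(age, 1.0, age == 0) for age in range(30)]

print(windowed_link_score(steady))    # 1.0
print(windowed_link_score(rotating))  # about 0.033, i.e. 1/30
print(windowed_link_score(rotating, half_life_days=7))
```

With the decay switched on, today's fresh link counts for more, but the rotating page still scores well below the steady one.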
[0080] In summary, search engine 125 may generate (or alter) a score associated with a document based, at least in part, on one or more link-based factors.
No wonder I couldn't read it (thanks).
But that excerpt makes a lot of sense. Some months ago I created a new page and I wanted it to do well in Google. So I immediately linked practically all my pages to it in a way I don't normally link to a page. For like two to five days, it was number one. Then it went back into the pack. That's when I realized Google was looking at how quickly a page accumulates links.
Or at least I "thought" that was what was happening.
Here is a synopsis of the patent by Randfish
Google's Patent: Information Retrieval Based on Historical Data
I am just starting to read it so I can't agree or disagree with any of his assumptions yet.
© Copyright Real Estate Webmasters 2004-2010, All Rights Reserved.