
Thread: Google: Changes In Ranking Strategies

  1. #1
    Join Date
    Aug 2007
    Posts
    9

    Default Google: Changes In Ranking Strategies

    INTRODUCTION
    This paper is a followup to On the Googleness of Being, published at Spider-Food.Net on February 11, 2005.

    Around the middle of December 2004, I realized that Google had begun adding fresh Xenite content into their primary cache on a weekly basis. By submitting newly created content for crawling at different times through the week, I found that anything submitted by Friday would usually appear in the cache (and, by extension, in search results) by the next Monday. The pattern has held consistent through the middle of April 2005.

    In February, I wrote that Google displays a secondary cache which is built from their daily crawls and is only used for minimal reporting. I have noticed differences of only a few days between the primary and secondary caches for some of my pages. In fact, I believe that I was only seeing cache from different data centers[1]. However, subsequent observations have led me to conclude that Google is, in fact, maintaining a historical footprint of Web site caches. Around March 6, 2005, Google's search results began displaying far fewer instances of descriptions and cache for a broad variety of search queries. I estimate that of my queries, somewhere between 50% and 75% of the results lacked cache and description data.

    Within two weeks of this event, Google began displaying cache and title data from 2004 for many of the uncached sites. Sampling the cache results, I found data from as far back as January 2004. It appears that Google was relying on shards[2] from a period extending back over a year to supply cache and title data. From about March 10 through April 10, I observed steady updates to the Google cache on a weekly basis. Old data was consistently replaced by new data extracted from recent crawls. During this period, people reported in several search engine optimization forums that their sites, which had not been crawled by Google for months, were being recrawled.

    In February, writing about what I perceived to be the two distinct caches in Google's search results, I said: "I believe Google is comparing the pages in primary and secondary cache, and if it finds a difference it reschedules a crawl for the site. The second crawl seems to be what kicks the content of secondary cache into primary cache." I now think the process is more sophisticated. Google maintains a footprint of reported changes in content. The footprint appears to be established by simple HTTP header requests. Any page which generates a 200 response code from the server is fetched. Any page which generates a 304 (NOT MODIFIED) code from the server is not fetched. The sampling of server response codes helps Google determine whether a site should be recrawled.
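
    The 200/304 rule described above can be sketched in a few lines. This is only an illustration of the behavior the post infers, not Google's actual crawler logic: the function names are hypothetical, and the use of an If-Modified-Since header is an assumption about how a server would be prompted to answer 304.

```python
from email.utils import formatdate


def conditional_headers(last_crawl_epoch):
    """Build headers for a conditional request (hypothetical helper).

    Assumption: the crawler sends If-Modified-Since with the time of its
    previous visit, so the server can reply 304 if nothing has changed.
    """
    return {"If-Modified-Since": formatdate(last_crawl_epoch, usegmt=True)}


def should_fetch_full_page(status_code):
    """Apply the rule described in the post: 200 means fetch, 304 means skip."""
    if status_code == 200:
        return True
    if status_code == 304:
        return False
    # Other status codes are not discussed in the post, so no decision is made.
    return None
```

    For example, should_fetch_full_page(304) returns False, which matches the observation that unchanged pages stop generating full-page requests.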

    By mid-March, Google had substantially reduced the number of full-page requests from Xenite.Org. The nature of Xenite's content is largely static, but once or twice a year I usually revise the basic appearance of the site. These revisions usually result in significant changes to on-site navigation, advertising, disclaimers, and cross-linking. The changes do not normally alter the body text of the content. By implementing a partial revision of page layouts across selected portions of Xenite.Org's network, as well as adding new content, I observed increased full-page fetching activity from Google. All the changed content was eventually recrawled and reindexed, although updates to the cache might lag by 9-10 days.

    New content, optimized for high placement in rankings with standard on-page factors[3], generally appeared in Google's search results within 5-10 days of being submitted for crawling. The rankings for targeted search terms usually began in the 30s or 40s. Within 2 weeks, a typical new page would break into the top 10 results for the targeted term. Within 3-4 weeks, many pages were ranked 1st or in the top 5. Competitiveness for search terms varied[4]. Many search expressions rated in the "Not Optimized" range of 1..10. The competitiveness of a search expression is not directly related to the level of traffic for that expression.

    For example, a search for "google" produces only results from Google's Web sites. There is no level of competitiveness because there is no competition in the top 10 listing. But Google receives millions of visitors each day.

    Recap of "On The Googleness of Being"
    In "On the Googleness of Being", I asserted that Google was replacing Xenite's forwarding pages (pages whose on-page content indicates that a URL has changed) with the actual (dynamic) URLs of the moved forum content. In fact, we maintain duplicate sets of forwarding pages on both Xenite.Org and SF-FANDOM for historical reasons (there are many off-network inbound links for the Xenite pages which send traffic to our forums). The Xenite forwarding pages have been largely dropped from Google's search results, whereas prior to February 1, 2005, it was common for both the Xenite and SF-FANDOM forwarding pages to be listed in the top 10 results. Now, about 50% of SF-FANDOM's forwarding pages are listed, and the remaining search expressions produce direct links to the dynamic forum URLs. We maintain 1st through 5th place rankings for our forums, which is consistent with past performance.

    There have been several reports in search engine optimization forums that Google is now increasing its cache size per page. That is, the reported cache now exceeds the 101 Kilobyte limit which Google had previously imposed. The increased amount of cache per page and the use of cached data from the previous 14 months imply that Google has substantially increased its server resources.

    In "On the Googleness of Being" I reported that:
    "Google is influenced by smaller content pages than it is by larger content pages." My observations since early February have not been consistent with that statement. However, the de-evolution of the Google cache in March was probably a significant factor in the change in observed behavior.

    "Google has swung back to embracing random fresh content." In fact, the crawling behavior I have observed, where 304 NOT MODIFIED codes are returned by servers, explains this shift in results. Google is not "embracing random fresh content", it is actively seeking ALL fresh content.

    "Inbound links are not important for the new content on established sites, provided that those sites are internally well-linked." Continued observations of Google's changes in results where new content appears support that conclusion.

    "Where Google detects redirection or supercession of content, it is bumping the new content up in the rankings at the expense of the older, redirecting pages ... WITHOUT REGARD FOR WHERE INBOUND LINKS ARE POINTING." This continues to be so in the search results I have monitored. However, this behavior does not appear to be related to the 301 redirect issues which have caused much concern in search engine optimization communities. The redirection referred to here is handled through Javascript and/or HTTP-EQUIV meta tags in page headers.

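    The HTTP-EQUIV style of redirection mentioned above can be detected with a short parser. This is a minimal sketch using Python's standard html.parser module; the class and function names are hypothetical, and it only looks for a meta refresh tag (JavaScript redirects are not handled). It says nothing about how Google itself detects such pages.

```python
from html.parser import HTMLParser


class MetaRefreshFinder(HTMLParser):
    """Record the target URL of the first <meta http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.target is not None:
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("http-equiv", "").lower() != "refresh":
            return
        # The content attribute looks like "0; url=http://example.com/new".
        _, _, rest = attr.get("content", "").partition(";")
        key, _, value = rest.strip().partition("=")
        if key.strip().lower() == "url":
            self.target = value.strip().strip("'\"")


def find_meta_refresh(html):
    """Return the redirect target of a forwarding page, or None if absent."""
    finder = MetaRefreshFinder()
    finder.feed(html)
    return finder.target
```

    A forwarding page containing a meta refresh tag yields its destination URL, while an ordinary content page yields None.
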
    I also introduced several terms to explain Google's apparent behavior. They are:
    REPUTATION, where Google appears to distinguish a site's importance on the basis of its past performance in Google's database. Performance may include ranking for multiple search queries. Performance may include obtaining a large number of inbound links. Performance may include obtaining inbound links from trusted sources (see TRUSTED CONTENT SITE below). Performance may include measurable growth in specific content (as opposed to growth through the addition of random topics).

    TRUSTED CONTENT SITE, where Google appears to handle changes and additions to the content of an older, well-established, large-content site better than changes and additions made to a smaller, younger site. Or, where Google appears to confer a status or reputation upon a site due to its top-level domain (in particular, sites with .EDU, .GOV, and .MIL top-level domains now seem to be treated as more than equals with other sites).

    LISTING INHERITANCE, where Google appears to transfer the search results positioning of one page to another page because the first (older) page is redirecting to the second (newer) page. The relative difference in page origination dates may be a factor. That is, an older page does not appear to replace a newer page.

    CHILD INHERITANCE, where Google appears to confer a measure of importance to a page newly added to a large content site. The child page may be ranked in search results in part according to criteria associated with its parent page or related pages (siblings) from the same site. A child page may therefore be deemed as important and valuable a resource as a parent page. Children of TRUSTED CONTENT SITES are most likely to inherit parent REPUTATION.

    TIMERANK, where Google appears to measure a site's value by accumulating timestamps or measurements of timestamps over a period of six to twelve months.

    With respect to REPUTATION, in "Googleness" I asked, "how much data can Google track for a Web site?" The question may have been answered in part by the subsequent release of a patent application titled Information Retrieval Based on Historical Data. The Abstract describes the methodology as:
    A system identifies a document and obtains one or more types of history data associated with the document. The system may generate a score for the document based, at least in part, on the one or more types of history data.
    _________________
    Garden fountain construction podium sale

  2. #2

    Default Re: Google: Changes In Ranking Strategies

    I nominate this for the longest, most random post EVER.

  3. #3
    Join Date
    Aug 2007
    Location
    Boone County, Illinois
    Posts
    311

    Default Re: Google: Changes In Ranking Strategies

    I 2nd.
    Jeff Hill
    RE/MAX Property Source
    6940 Villagreen View
    Rockford, IL 61107
    Direct 815-489-3401
    Cell 815-315-2626
    Rockford Real Estate
    Rockford and Boone County Home Values

  4. #4
    Join Date
    Jan 2007
    Location
    Morristown NJ
    Posts
    1,579

    Default Re: Google: Changes In Ranking Strategies

    I could not read the whole thing. Not enough sleep last night, and I got lost in the boring history. I kept thinking, get to the point, get to the point, and then I finally gave up and just read what people were commenting on this.
    James Boyer
    RE/MAX Properties Unlimited
    Morristown, NJ 07960
    973-647-0253
    Serving the Real Estate markets of Morristown, Morris Township, Madison NJ Real Estate, Chatham NJ , Summit, Short Hills, Millburn, Maplewood, South Orange, & West Orange Referals happily given and accepted. For information on home sales in New Jersey please contact. Morristown NJ Real Estate Madison NJ Real Estate Chatham NJ Real Estate

  5. #5
    Join Date
    Aug 2007
    Location
    Metro Atlanta
    Posts
    370

    Default Re: Google: Changes In Ranking Strategies

    Need coffee, reading post putting me to sleep.

  6. #6
    Join Date
    Aug 2006
    Location
    Fort Worth texas
    Posts
    4,941

    Default Re: Google: Changes In Ranking Strategies

    This guy is a spammer
    Looking for Real Estate, Investments, Condos in Dallas-Fort Worth-Denton-Keller-Haslet-North Texas Area. We have you covered, 400 New Homes and 60,000 Pre-owned home. Mike Pannell 817-703-3238
    Dallas Texas Real Estate | Fort Worth Texas Real Estate | Dallas Real Estate

    And remember...Nu Home Source Realty Rebates 20% of earned commissions back to the buyer!

  7. #7
    Join Date
    Nov 2006
    Location
    South Carolina
    Posts
    467

    Default Re: Google: Changes In Ranking Strategies

    Wow, it hurts my eyes looking at it.

  8. #8
    Join Date
    Jun 2006
    Posts
    536

    Default Re: Google: Changes In Ranking Strategies

    More pontification !

  9. Default Re: Google: Changes In Ranking Strategies

    Quote: Originally Posted by metaylor9
    I nominate this for the longest, most random post EVER.
    Quote: Originally Posted by Northern_IL
    I 2nd.
    I 3rd. All in favor of a banning, say "I"
    •:*¨¨*:• Richland Wa Homes | Kennewick Homes | Pasco Wa Homes •:*¨¨*:•

  10. #10
    Join Date
    Sep 2005
    Location
    D
    Posts
    1,177

    Default Re: Google: Changes In Ranking Strategies

    Obviously plagiarized verbatim from someone else's work, and probably obsolete, judging by the 2005 references. Should have been linked to original source, if anything.
