Inconsistencies Category Archive
At Search Engine Land, Danny has a long report about Google indexing and ranking issues. While other sections of the post talk about an update to the visible PageRank, issues with supplemental results, and duplicate content, I found the short section on the
filetype: command most interesting. Like some of Google's other field search prefix commands,
filetype: results in zero records unless it is combined with another search term. So
filetype:xls finds nothing, but this is supposed to change sometime in the future and will finally let us run a
filetype: search without requiring an additional term. Does this mean that other field searches can be run on their own as well? We'll have to wait and see. In the meantime, if you'd like to get all the results Google will give you for some unusual file type, there is an easy way around the additional term requirement.
OK, here's an unusual Google search result. With my preferences set to display 100 results at a time, I ran a search for
powells books to see if it would use the plus box. Google only displayed the first four of about 962,000 (a wildly inaccurate number, but certainly there should be more than 4). So what happened? I changed the number to be displayed to ten, and Google gave ten results. When I switched back to display 100, I more than quadrupled my retrieval with 18! Switching from Firefox 2 to IE 7 to avoid cookie effects, I still just got 18.
Here's an interesting report from John Battelle about Amazon not showing up as a result on Google when
amazon.com is the query. Danny adds some comments along with a report for the same event happening with Digg.
Here's another report of Google problems with the
site: command and a partial fix. The problems have been with host names containing punctuation like your-university.edu and with including a trailing slash after the domain name as in
site:university.edu/. The first has reportedly been fixed, but the trailing slashes are still a problem.
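Until the trailing-slash problem is fixed, a query can be cleaned up before it is sent. Here is a minimal sketch in Python, assuming a hypothetical helper named normalize_site_query (my own name, not part of any Google tool) that strips trailing slashes from the operand of any site: term and leaves the rest of the query alone:

```python
def normalize_site_query(query: str) -> str:
    """Strip trailing slashes from the operand of any site: term in a query.

    Hypothetical helper for illustration; it only rewrites the query text
    and makes no request to any search engine.
    """
    terms = []
    for term in query.split():
        # Only touch terms that use the site: prefix (case-insensitive).
        if term.lower().startswith("site:"):
            term = term.rstrip("/")
        terms.append(term)
    return " ".join(terms)
```

For example, normalize_site_query("site:university.edu/ library") gives back "site:university.edu library". Whether the cleaned-up query then returns complete results still depends on Google's side of the problem.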
A (very) long discussion thread at WebMasterWorld, Pages Dropping Out of Big Daddy Index, explains an inconsistency with how the
site: search at Google works at the moment. For some background, Google has a large database of "supplemental results" which typically only show up in search results when the total number of results is less than some Google-only-knows number. These supplemental results (tagged as such after the file size) are updated much less frequently and are often duplicates or dead links. However, sites with large numbers of pages find that some of those pages end up in Google's supplemental database. This thread discusses how the
site: search is failing to bring up some results from the supplemental index even though the pages might be found by a keyword search.
Gary Stock, founder of Google Whacking, has posted information about recent strange problems at Google. He has dubbed these GoogleNACK (as in Negative ACKnowledgements) and offers detailed examples. Seth Finkelstein postulates that the malfunction is related to Google's spam defenses. As of today, some of these searches are fixed, but others like
keyboard bracelet and
motorcycle candle fail with "Results 1 - 19 of about 48,600" and "Results 1 - 69 of about 64,000" respectively.
In another unrelated (I assume) peculiarity, a search on Google for pages only on Google's own Web site (using site:www.google.com) and searching for the word "google" finds several results that are on completely different hosts. Reported on Slashdot on Oct. 6, the inconsistent results continue. As of Oct. 11, a search on
site:www.google.com google should only find pages at Google. Yet with the number of hits set to 100, some records come up from adobe.com, digits.com, osdn.com, and even washington.edu.
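A check like this is easy to repeat once the result URLs have been copied out of the results page. Here is a minimal sketch in Python, assuming a hypothetical function named off_site_results (my own name) that takes a list of result URLs and the host used in the site: limit and returns the URLs that fall outside it:

```python
from urllib.parse import urlsplit


def off_site_results(result_urls, site):
    """Return the URLs whose host is neither `site` nor a subdomain of it.

    Hypothetical helper for illustration. URLs without a scheme have no
    parseable host here, so they are reported as strays too.
    """
    site = site.lower().rstrip(".")
    strays = []
    for url in result_urls:
        host = (urlsplit(url).hostname or "").lower()
        if host != site and not host.endswith("." + site):
            strays.append(url)
    return strays
```

Called with the host www.google.com and a list of collected URLs, it would flag any result from hosts such as adobe.com or washington.edu. Collecting the URLs is left out on purpose, since that depends on how the results page is saved.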
The Google Inconsistencies page has been updated with these problems.
Back in May, Google's intitle: and inurl: were not working properly, as I posted earlier. Well, they now seem to be working again. A search that combines a general query term with these field searches, like "market research" intitle:tourism, now works. I've updated my Google Inconsistencies page to note that the problem has been fixed, but I added another report of a strange result for the simple query of 'cameras.'
For more than a month now, the intitle: and inurl: field searches have been broken. I first heard of this on May 27, 2003. The advantage of intitle: and inurl: over the advanced search page Occurrences section or the allintitle: and allinurl: field searches was that they applied to only a single term and could be combined with other search terms that would look through the whole record. So now, searchers cannot do a search that looks for one word in the title and another in the body. A search that tries to, like "market research" intitle:tourism, retrieves many results that do not include 'tourism' in the title.
At first I thought this was a temporary glitch from the strange May update, but it has persisted through the June update and beyond. Hopefully it will be corrected sometime soon. I've updated the Google Inconsistencies page with this problem and several other long-term problems.
In addition, I updated several parts of the Google Review, including several language limits added in early 2002 that I had missed: Croatian, Indonesian, Serbian, Slovak, and Slovenian.