January 2008 Archive
So after learning yesterday that the Russian search engine Yandex (Яндекс) cached pages, I started looking at a few other well-known, non-English search engines. Baidu, the Chinese search engine, has just expanded into Japan with a Japanese Baidu. Both of these also have cached copies of pages. At Baidu, look for the 百度快照 links after the URL (similar to Google's placement). For the Japanese version, the cache link is キャッシュ in a similar location.
For us non-Chinese and non-Japanese speakers, is there any use in these? Well, they are one more source for archived versions of pages, including English-language ones. For example, a search on library of congress (in English) finds hits at both. Here is a screenshot of the Chinese version with the cache link in gray at the end.
And here's the Japanese version, with the cached link again in gray at the end. I'll be adding both of these to my Finding Old Web Pages page.
I do not usually spend much time with country-specific search engines, especially those in languages I do not speak. Even with English-language country-specific search engines, the general search engines usually have more comprehensive results and better search functionality. So when Phil posted about the Russian search engine Yandex (Яндекс), I just thought I'd take a quick look. Something piqued my curiosity, and I tried a few of the links. Sure enough, Yandex caches copies of many of the pages that it indexes. Look for the Сохраненная копия link at the bottom left of a search result record as in the screen shot below.
Yandex's cache does not include a date, at least not one that I could identify, but from a few tests, it seems that the cached page may be anywhere from quite recent (the day before) to several months old. I've added Yandex to my Finding Old Web Pages page.
TechCrunch discovered that for some Yahoo! searches, Yahoo! has added links to (Yahoo!-owned) Del.icio.us bookmarks. Yahoo! does not use the Del.icio.us name. Instead, the Del.icio.us logo is followed by a number and "people bookmarked this page under" whatever tags they used. See the two lines outlined in red in the screen shot.
While not all Yahoo! results have been bookmarked in Del.icio.us, this is a great combination of information from the two services. Click on the number to see what comments and notes people have written about the site when they bookmarked it. While I wish the comments would appear when I mouse over the link, it is still an incredibly easy and quick way to see what others have said about a Web page before you visit it. So how do you get Yahoo! to show the Del.icio.us information?
At SearchEngineLand, Barry noticed that Google is no longer alerting searchers that stop words are not searched. Previously, when a query outside of phrase marks contained a stop word, Google would usually tell searchers that the stop word "is a very common word and was not included in your search." Does this mean that Google no longer has any stop words? Based on a few of my tests with a small retrieval set, comparing a search with a stop word and another search with a + in front of the stop word, it does seem that Google will on occasion still ignore some stop words.
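For anyone who wants to repeat that kind of test, here is a rough sketch in Python of how the query pairs could be set up. The stop word list and both function names are my own assumptions for illustration (Google does not publish its list), and the code makes no attempt to run the searches; the hit counts have to be read off the results pages by hand.

```python
# Assumed stop words for illustration only; Google's real list is not published.
ASSUMED_STOP_WORDS = {"a", "an", "the", "of", "in", "to"}


def plus_variant(query, stop_words=ASSUMED_STOP_WORDS):
    """Return the query with '+' placed in front of each assumed stop word."""
    words = []
    for word in query.split():
        if word.lower() in stop_words:
            words.append("+" + word)  # force the engine to include this word
        else:
            words.append(word)
    return " ".join(words)


def stop_word_seems_ignored(plain_hits, plus_hits):
    """Compare hit counts noted by hand for the plain and '+' searches.

    If forcing the stop word changes the number of hits, the plain
    search probably did not use that word.
    """
    return plain_hits != plus_hits
```

Run each query and its plus_variant() on a topic narrow enough to return only a handful of hits, then compare the counts. A difference suggests the stop word was dropped from the plain search, though other ranking or indexing quirks could also explain it.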
Forgot to mention my Newsbreak article from last week, AskEraser: Privacy Potential. In working on that article and looking at AskEraser, one issue occurred to me that I did not cover in the piece itself.
I received email from Ixquick about AskEraser even before I saw any press release from Ask. In that email, Ixquick claims that
OK, Ask uses ads from Google. To display context-sensitive ads, related to the actual query, they have to send at least the query string to the Google ads server. While I could not get a definitive answer from Ask about what specific data elements Google sees, given Ask's commitment to privacy with AskEraser it seems likely that not much more than the query is sent. Even so, I was curious to see how Ixquick funds its service, and sure enough, it uses Google AdWords as well (under Sponsored Listings). So how is their privacy better than AskEraser?
Neither Ask nor Ixquick gives complete privacy, nor do they claim to. I credit both of them with raising the search privacy issue and providing at least some tools for helping protect searcher privacy. Since the vast majority of searchers pay no attention to such issues, I'd rather see all the search engines providing better privacy options than just criticizing their competitors' attempts.
Dean posted a scathing review of Google Scholar's performance over the past year based on a 32% decline in unique visitors according to ComScore data. More data on the changes at various Google properties between Nov. 2006 and Nov. 2007 are available in a TechCrunch posting. While I am sure that these data do not fully reflect actual Google traffic (and at least one comment on Battelle's Searchblog post says "a staff member from Google . . . tells me that ComScore has some of their numbers wrong"), I still find them fascinating. To no one's surprise, Web search is by far the busiest Google property. Google Directory traffic went down, which is not surprising since Google has made it so much harder to find. But the huge declines in Product Search (down 73%), Scholar (down 32%), and Video Search (down 12%) surprised me. Book Search, on the other hand, has grown significantly in visitors (up 55%).
The chart showing which Google properties get the most visits is interesting as well. Web and Image search dominate and are both growing. After those two come Gmail and Google Maps, which both rank higher in visitors than Google News. Given Blog search's increased prominence on the Google News page, I was also surprised to see how few visitors ever went to it.
For Google users who visit many of their services, this is a telling lesson about how others use or do not use so many of Google's search services and applications. I also agree with Dean that Google Scholar's drop in visitors (if that is indeed accurate) comes in part from their failure to improve the service. I have found general Web searches often more effective than Google Scholar searches for at least some scholarly documents.