February 2003 Archive
Apparently, Google is moving more aggressively into the advertising business. They are starting a new text ad program, Google Content-Targeted Advertising, which will display ads on non-search related pages. The ads in this program, like those that Google now shows on its own site, are not graphics or banner ads, but text ads with a colored background. They are free until March 12, but where they will be displayed after that remains to be seen. See also Google's FAQ for Content-Targeted Advertising.
Another week, another acquisition. Last week, Overture announced plans to buy AltaVista. This week it announced the planned acquisition of the Web Search Unit of Fast Search & Transfer, which includes AlltheWeb, FAST Web Search, and the FAST PartnerSite paid inclusion program. The purchase price is $70 million plus a performance-based cash incentive payment of up to $30 million over three years. See also the FAST press release.
Bear in mind that, at least for now, AlltheWeb and AltaVista continue to have their own separate databases and their own unique search features. How all this will change in the future remains to be seen.
The advertising search engine Overture announced plans to acquire AltaVista. It's another interesting acquisition, although there are scant details on the long-term future of AltaVista. At least Overture has the money to support AltaVista and may be able to maintain the AltaVista database and improve it. With Yahoo's purchase of Inktomi, one obvious change may be that Overture's follow-up search engine will switch from Inktomi to AltaVista. This could also change sites like Go.com that use the Overture/Inktomi combination. But we'll have to wait to see what changes will actually happen. Overture has said that they plan to maintain AltaVista as a destination site.
For reasons known only inside the company at this point, Google has bought Pyra Labs, maker of Blogger, the free Weblog site and software. Fittingly enough, instead of being announced in a press release, the news first showed up in a blog. There are no known immediate changes to either Google or Blogger as a result, but it will be interesting to see what comes of this acquisition.
HotBot's version of Teoma has finally added some of the advanced search features that have been available on Teoma's site for the past few months. It does not include all of the advanced search features but does offer language, date, and region limits along with the offensive content filter that had been there before. In addition, Teoma's 'metasites' or 'resources' now also show up at HotBot right under the Teoma logo. Only one metasite is listed after 'Try Resources' but a link to 'More' will display up to ten. Also, if you set HotBot's results preferences to display related searches, they will now show up for Teoma as well.
Googlert and SearchAlert.net are two new free services that offer email alerts when new search engine results are available. Googlert was launched in January, but I'm not sure when SearchAlert.net started. Googlert works only on Google and does require registration for a free Google API key. SearchAlert.net says that it "continually monitors the big Web search engines" but does not specify which ones. Alerts page updated with both of these.
Finding Google's cached copy is not always trouble free. Take the recent example of an interesting story in which journalistic confusion gets even more confused. Apparently, a Computerworld reporter was fooled into believing that terrorists claimed responsibility for the recent "Slammer" worm. The original story was posted online but now states "Computerworld removed this story due to questions about its authenticity. An update about this situation has been posted."
So what does this have to do with Google's cache? Well, other reporters thought they might find the original story in Google's cache. Google Village, in their story Google Everflux Misses Slammer Terror, states that "Google is good at getting the fresh stuff each day, but not good enough to capture a page, and cache it after such a page has appeared for a few hours." And The Register reports that the story "doesn't seem to have been around long enough to make it into Google cache."
Well, I beg to differ. For as long as it lasts, take a look here. Presumably, the reporters tried a search like cache:www.computerworld.com/securitytopics/security/virus/story/0,10801,78219,00.html which currently gives no results. If they had gone one step further and clicked on the "News" tab, they would have found the cached file. Note that the cached copy is missing the usual surrounding text and graphics. I think this is due to the way Google identifies news articles for indexing, leaving out the navigational and other surrounding text. Google News search results do not display a link to a cached copy of the story, but apparently they are there anyway. And in case the cached copy disappears from Google, I have a copy on my site.
Oh, and while I'm on the topic, I've noticed some other oddities with Google's cache. Google has two rather distinct crawls: the regular GoogleBot crawl, sometimes called DeepBot, and a smaller one that focuses on frequently refreshed content. The latter is often called FreshBot. Results from FreshBot usually have a date listed before the "Cached" link. These two crawls can have two separate cached copies at Google. For example, a search on lisnews today finds the top hit with a date of "Feb 10, 2003." Click on the "Cached" link, and the latest story is actually from Feb. 9. But a direct search for cache:www.lisnews.com pulls up a page cached Jan. 11. Both pages are searchable in Google's index. But for hardcore cache users, the point is that there are two versions of the page accessible from Google, if you are willing to do a little digging.
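For anyone who wants to build these cache: searches without retyping addresses, here is a minimal Python sketch. It only assembles the query text in the form used in the examples above (the page address with the http:// dropped, prefixed by cache:). The function name cache_query is my own invention for illustration, and the sketch does not contact Google or assume anything about how Google handles the query.

```python
from urllib.parse import urlsplit


def cache_query(url: str) -> str:
    """Return a 'cache:' query for a page address (hypothetical helper).

    The address may be given with or without a leading scheme
    such as http://; the scheme is dropped in the result.
    """
    # urlsplit only recognizes the host when the address contains '//',
    # so add it for bare addresses like 'www.lisnews.com'.
    parts = urlsplit(url if "//" in url else "//" + url)
    target = parts.netloc + parts.path
    if parts.query:
        # Keep any query string that is part of the page address.
        target += "?" + parts.query
    return "cache:" + target


if __name__ == "__main__":
    print(cache_query("http://www.lisnews.com"))
```

The result is just the text to type into the search box. As noted above, the same cache: search can give different answers than the "Cached" link on a FreshBot result, so it is worth trying both.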
The MyWay.com portal, launched back in Oct. 2002, has expanded from just using Google for its search engine to offering four others, which can be chosen as a default or used as follow-up search engines via tabs at the top. The additional search engines are labeled AltaVista, Ask Jeeves, AlltheWeb, and LookSmart. But Ask Jeeves is really just their Teoma database. AlltheWeb is the FAST database. And LookSmart is a combination of LookSmart directory entries and the Inktomi database. The "sponsored listings" continue to be ads from Google's AdWords program. Unfortunately, no advanced search forms are available except for Google, and that form lacks several options available directly from the Google advanced search page.