September 2002 Archive
Google Dance Begun
The Google dance appears to have begun yesterday and there is much weeping and gnashing of teeth in the optimization community. The Webmaster World forum thread discussing the update already has over 430 posts since it started yesterday morning. What is the Google dance? It occurs when Google is launching a new database, first on www2.google.com or www3.google.com and then eventually on the main site. It can take several days for the whole dance to finish. So right now on the main www.google.com, the bulk of their database appears to have come from an early August crawl. The database on www2 is from a late August to early September crawl. So searchers take note. Try www2 for the more current records, but expect changes over the next few days from what you got at Google earlier this week.
So why all the frantic discussion in the forums? It seems that Google may have made a more significant change than usual to their relevance ranking algorithms. According to a related Webmaster World thread, the changes have moved Microsoft out of the top spot for a phrase search on "go to hell" and perhaps have increased the importance of anchor text from the Open Directory. Again, the point for searchers is that the results will likely change compared to what you have seen. Whether the relevance ranking is better or worse remains to be seen, but it will probably depend greatly on the search terms.
Gigablast Now Offering Site Search
Gigablast may be around for a while if it can make a go of offering site search to paying customers. The announcement appeared on their site today stating that "for a teeny fraction of the other guys' prices you can have an account on Gigablast.com that can support millions of web pages." Of course, the monthly price for a million-page site is still US$2,500.
Updated Reviews
I've made minor updates to several reviews including HotBot, Lycos, MSN, Fast Search, and Inktomi. Changes include updates on which Inktomi features work at HotBot and MSN, and a note on how to get the MSN advanced search to work without a search term. (Thanks, Gary, for that tip.) I've also added the Position Tech Inktomi search to the Inktomi review.
Daypop Back?
Daypop, the recent news and Weblog search engine, appears to be back up after being down for several weeks. The front page still says that it is out of disk space, but it is working again. The Top 40 and Top News are not yet functioning, but the search engine is. For more on Daypop and blogging, see my article in the latest issue of ONLINE: "The Blog Realm: News Sources, Searching with Daypop, and Content Management." ONLINE 26(5): 70-72, Sept.-Oct. 2002.
KWIC & site: at AlltheWeb
While Google gets lots of press for the relaunch of its news search, AlltheWeb has been busy the past few weeks. As of yesterday, they have started rolling out a dynamic keyword-in-context (KWIC) extract in the results list. Available at Lycos for a while now, this feature is finally making its way into the AlltheWeb results. This is the kind of display Google usually provides, where the extract contains the actual search terms along with some of the surrounding text.
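For the curious, here is a rough sketch of what a KWIC extract does. This is only an illustration of the general idea, not how AlltheWeb, Lycos, or Google actually build their extracts; the function name kwic_extract and the width parameter are my own inventions.

```python
import re

def kwic_extract(text, terms, width=40):
    """Return a snippet of `text` centered on the first matching search term.

    Hypothetical helper for illustration only; real engines use their own
    (unpublished) extract logic.
    """
    # Build one case-insensitive pattern that matches any of the search terms.
    pattern = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)
    match = pattern.search(text)
    if match is None:
        # No term found: fall back to the start of the document.
        return text[:2 * width].strip()
    # Keep `width` characters of context on each side of the matched term.
    start = max(0, match.start() - width)
    end = min(len(text), match.end() + width)
    snippet = text[start:end].strip()
    # Mark the snippet as truncated where it does not reach the text's edge.
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return prefix + snippet + suffix
```

Calling kwic_extract(page_text, ["dance"]) would return the first occurrence of "dance" with about 40 characters of context on each side, which is the contrast with a static extract that always shows the first lines of the page.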
In addition, AlltheWeb has a new site: field search, which is more precise and easier to remember than their older url.host and url.domain:. I've updated the AlltheWeb Review and the example in the Fields section notes that site:www.total.com finds different results than site:www.total.com.au.
Give it a try, but expect that there may be changes to the way it works over the next few weeks.
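One way to picture why site:www.total.com and site:www.total.com.au give different results is to treat the field as a match on the whole host name. The sketch below assumes exactly that; I do not know how AlltheWeb implements the field internally, and the function name matches_site and the sample paths are made up for the example.

```python
from urllib.parse import urlsplit

def matches_site(url, site):
    """True when the URL's host name equals `site` exactly.

    Assumption for illustration: the site: field compares complete host
    names rather than prefixes. This is not AlltheWeb's documented behavior.
    """
    # urlsplit lowercases the host name; it is None when the URL has no host.
    host = urlsplit(url).hostname or ""
    return host == site.lower()

# Hypothetical URLs, used only to show the distinction.
print(matches_site("http://www.total.com/about", "www.total.com"))
print(matches_site("http://www.total.com.au/about", "www.total.com"))
```

Under this assumption, a page on www.total.com.au does not count as a hit for www.total.com even though one host name begins with the other.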
Google News Tabbed, Updated, & Expanded
Google News has greatly expanded its number of news sources (to "approximately 4,000") and the depth of its archive. It also has a newly redesigned look and has finally added the News Tab on the main page and on search results pages. According to the About page,
"Google News continuously crawls more than 4,000 news sources from around the world. This number will continue to grow as we develop the service further" and it now "includes articles that appeared within the past 30 days." There is still no advanced search, although Tara points out that adding &num=100 to the end of a results URL will give 100 results at a time. Even easier, just change your regular Google preferences to default to 100, and you don't even need to add the special code.
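If you would rather script the &num=100 trick than retype it, something like the following should do it. It is only a sketch: the function name with_num_results is hypothetical, and any results URL you feed it is your own; the only thing taken from the tip above is the num parameter itself.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def with_num_results(url, num=100):
    """Return `url` with its num query parameter set to `num`.

    Hypothetical helper; it just rewrites the query string and does not
    contact any search engine.
    """
    parts = urlsplit(url)
    # Drop any existing num parameter so it is not duplicated.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "num"
    ]
    query.append(("num", str(num)))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Appending the parameter by hand works just as well; the only advantage here is that a num value already in the URL gets replaced rather than repeated.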
I can't say I'm impressed with the "Google News is highly unusual in that it offers a news service compiled solely by computer algorithms without human intervention" boast or the lack of a list of those 4,000 sources. However, the results are certainly much broader than what was offered before.
Yahoo! Ditches Research Documents?
Whether it is a temporary glitch or a permanent change, Yahoo! is not giving "Research Documents" as another category in their search results today. Previously, the "Research Documents" link showed up on a search results page after "Web Pages" and "News." The link was to full-text articles from divine, Inc.'s Northern Light Special Collections. A Yahoo! Help file still describes them, but the link is gone today.
WiseNut Number Gone
I just noticed today that WiseNut no longer displays a number in the upper left corner. Formerly, WiseNut posted "1,571,413,207 Web pages and counting!" there. Now that they have finally launched a fresher database (as I posted earlier), apparently they are either no longer counting or, more likely, it is a smaller overall database. Strangely enough, the old number is still up on their corporate contact page.
Site Redesign Begins
I am finally starting the slow process of updating the site design for Search Engine Showdown. At this point, part of the site has been converted, and over the next few weeks, I hope to finish the rest. In addition to the redesign, news and updates from the past few months are finally being posted. See in particular the new reviews for Gigablast and Openfind, an updated search engine features chart and the search engines by features page. There is a new news archive which includes subject access to news postings (at least for those since about May 2002).
I know there is a lot more to be done, but if you have any opinions about the new design or other comments about the site, email me at greg@notess.com. Oh, and if you want to link to a particular news post, use the [link to this story] link at the end for a story-specific URL.
WiseNut Refreshed
At some unknown and unannounced point in the past month, WiseNut finally refreshed its database. For most of the past year, WiseNut had almost no new content from any later than July 2001. By this past July, it was a year-old database. Now, it has launched a new database (even though it still claims the same 1.5 billion pages) which appears to be primarily from May 2002. So while it is still not very up-to-date, it is much fresher than it used to be.
Google News Changes
Gary Price points out that some changes are going on at the Google News search. Search engines like to experiment by giving one out of, say, a thousand queries the experimental interface or results and then gauging users' reactions. That makes it hard for the rest of us to see the details of the experiment unless someone grabs a quick screen shot. Just earlier this week on Yahoo! I noticed that the "Web Pages" link was not highlighted unless you clicked on one of the other links first. And the Powered by Google had moved way down to the bottom. Was this the beginning of a change to another search engine or an attempt to lessen the amount they pay to Google? Or was it just Yahoo! experimenting with a different approach? Time may or may not tell.
Fast Company Article on FAST
Interesting article on FAST from the unrelated but similarly-named Fast Company magazine.
AlltheWeb Alchemist Contest Deadline Extended
AlltheWeb has extended the deadline for the AlltheWeb Alchemist Contest to Oct. 31, 2002.
Wayback Machine Behind
I have been disappointed for a while that the Wayback Machine from the Internet Archive has had no new pages included for most of this year. In today's TVC News, Gary Price reports that he got a reply from them saying that "they are about six to eight months behind in adding data to the archive. But they expect to make pages from the first half of 2002 available during the next four weeks."
HotBot UK & DE Switch to FAST
People posting in the Webmaster World forums report that both HotBot UK and HotBot Germany have abandoned Inktomi and moved to FAST. Considering Lycos' stake in FAST, it is surprising this did not happen sooner, but it has not yet changed at the U.S. HotBot.com. Unfortunately, the new underlying data at these HotBots has not fixed some problems, such as showing a 'see results from this site only' link even when there is only one hit from that site.
AlltheWeb First to Search Flash
AlltheWeb announces the ability to search Macromedia Flash files. This is the first search engine to include these file types in its database. In addition, they have added several new features to their Advanced Search page. Like HotBot and MSN, AlltheWeb now has an "Embedded Content" option for finding pages that contain (or do not contain) the following file types: images, audio, video, RealVideo or RealAudio, Flash, Java applets, Javascript, and VBScript. They have added a regional domain limit based on top level domain groupings. They have expanded their date limit to cover any specified dates and added a document directory depth limit, a personal home page limit, and a Flash file format limit. They also have a new field search in the Word Filters section: "in the host name." Many of these advanced features are ones that Inktomi search engines have offered for years. Does this mean that Lycos-owned HotBot may switch to FAST soon?
AltaVista On Being Blocked in China
AltaVista issued a press release about their search engine being blocked in China. It includes several ways for Chinese users to get around the block by going to other AltaVista sites.
Inktomi Does Concepts
Danny Sullivan reports on Inktomi's new 'conceptual search' which Danny prefers to call anti-proximity. The idea is that for a single-term search, Inktomi's ranking will prefer uses of the term by itself rather than in common phrases. For example, a search on 'york' or 'mexico' will push pages to the top where those terms are used by themselves rather than in other common phrases like 'new york' or 'new mexico.' It's an interesting approach that other search engines may wish to consider.
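To make the anti-proximity idea concrete, here is a toy version. It is emphatically not Inktomi's algorithm, which has not been published; it simply counts how often a term appears outside a supplied list of common phrases and sorts pages by that count. The function names, the phrase list, and the sample texts are all assumptions for illustration.

```python
import re

def standalone_count(text, term, phrases):
    """Count occurrences of `term` that are not part of any listed phrase.

    Toy scoring only. Assumes each phrase contains `term` exactly once and
    that the phrases do not overlap one another in the text.
    """
    lowered = text.lower()
    # All whole-word occurrences of the term.
    total = len(re.findall(r"\b" + re.escape(term.lower()) + r"\b", lowered))
    # Occurrences that sit inside one of the common phrases.
    in_phrases = sum(
        len(re.findall(r"\b" + re.escape(phrase.lower()) + r"\b", lowered))
        for phrase in phrases
    )
    return total - in_phrases

def rank(docs, term, phrases):
    """Sort documents so those using the term by itself come first."""
    return sorted(
        docs,
        key=lambda doc: standalone_count(doc, term, phrases),
        reverse=True,
    )

# Made-up sample texts.
docs = [
    "New York pizza in New York",
    "The city of York has a minster in York",
]
print(rank(docs, "york", ["new york"]))
```

With these two sample texts, the page about the city of York sorts ahead of the page about New York, since every 'york' in the latter is swallowed by the phrase.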
InfoSpace Gets Google
True search engines do not always like meta search engines that seem to freeload on their hard-built databases and yet contribute no cash to the process. Google usually blocks meta search engines from retrieving Google's results, but now they have reached an agreement with InfoSpace to include regular search results and ads from Google's AdWords database on their meta search engines including Dogpile, Excite, WebCrawler, MetaCrawler, and InfoSpace. See the InfoSpace press release or the one from Google.
Results Display Experiments at FAST
Pandia reports and translates some interesting experimental work FAST is doing on the display of results. It involves "technology that recognizes person names and geographical locations in all types of documents." You can see a screen shot in the original Norwegian article at Digi.no.
