March 2003 Archive
In Google News you used to be able to use advanced syntax such as cache: followed by a URL to pull up a cached news story, or site: to limit results to a specific publication. These operators no longer work, and Google says "site:nytimes.com was dropped from your search because it is not supported for this type of search." For title searching, intitle: still works. Instead of site:, try source:, which should be followed by either the single word for the source title that Google shows in green or, for multiple-word sources, the words joined with an underscore (_) character, as in source:new_york_times. Google News could really use an advanced search form and the restoration of the cached copies.
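As a rough sketch of the source: rule described above, the snippet below turns a source title into the operator. The function names are hypothetical, and lowercasing the title is an assumption based only on the source:new_york_times example; I have not checked whether Google News treats the operator as case-sensitive.

```python
def source_operator(source_title: str) -> str:
    """Build a source: term from a source title as displayed in Google News.

    Single-word titles are used as-is; multiple-word titles are joined
    with underscores. Lowercasing is an assumption taken from the
    source:new_york_times example, not a documented requirement.
    """
    words = source_title.split()
    if not words:
        raise ValueError("source title must not be empty")
    return "source:" + "_".join(word.lower() for word in words)


def news_query(terms: str, source_title: str) -> str:
    """Combine ordinary search terms with a source: restriction."""
    return f"{terms} {source_operator(source_title)}"


print(source_operator("New York Times"))
print(news_query("search engines", "New York Times"))
```

The resulting string is what you would type into the Google News search box in place of the old site: limit.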
WiseNut owner LookSmart apparently bought Grub back in January. Grub is a project in "distributed Web crawling" that offers a screen saver that uses a computer's spare clock cycles to do some Web crawling. If this spreads widely enough, it could certainly help WiseNut's crawling and make WiseNut more up-to-date. Hopefully, people will not find ways to spam the crawling process. I am not seeing any update at WiseNut yet (most of their results still seem to be from Autumn 2002).
Yahoo! announces the completion of their acquisition of Inktomi. In the press release, Terry Semel, Yahoo! CEO, says that "bringing together a powerful combination of Yahoo!'s global audience and unmatched breadth and depth of services with Inktomi's leading search technology, will allow us to create one of the most relevant, comprehensive and highest quality search offerings on the Web for both our affiliate partners and Yahoo!." How exactly that will be accomplished remains to be seen. Google results still dominate at Yahoo!.
As of March 7, all Northern Light sites were dead or redirecting to divine. But it looks like Northern Light has a few more gasps of life in it. The news database is updating (sporadically) and searchable again, even though the "Today's Headlines" section is not. Even the Web database is searchable again. However, don't expect it to last too long.
On top of the switch from Overture to Google, the Hollywood Reporter now says that Disney is looking at selling off the Infoseek patents and technology that used to power Go.com. Will we see a resurrection of Infoseek?
MSN Search has launched their new version (in beta testing since at least Feb. 11). On the basic search page, there is less clutter and no banner ads. The advanced search has added limits for PDF, Word, PowerPoint, and Excel documents, and the underlying Inktomi database via both interfaces now includes those kinds of files. Note that the basic search goes to LookSmart directory hits first and then Inktomi while the Advanced search goes straight to Inktomi results.
This week, Northern Light and NLResearch have been dying bit by bit. Now, almost all of the Northern Light and NLResearch URLs point to divine pages. The few that do still go to old Northern Light pages no longer have a search form that works. Northern Light News had one last gasp of content on Monday and then stopped updating again. So it looks like a final farewell to an old friend, and a search engine that had features as yet unduplicated by the survivors. Their custom folders, the taxonomy behind them, the truncation capabilities, and the combination with their fee-based Special Collection made Northern Light an important and useful tool. Unfortunately, it never achieved the popularity or profitability of others.
In addition to the other changes at AlltheWeb, the directory depth limit and the home page limit are now gone. It seemed that they originally added those limits to help get the HotBot account. Now that search feature is also gone at HotBot for their FAST database and their Inktomi database.
Even with all the other recent changes, this caught me totally by surprise. Remember Infoseek, which became Go after Disney bought it? It dumped its own search engine back in March of 2001 and replaced it with straight Overture searches. It now says "Powered by Google" and gives both Google AdWords results and regular Google results. The Google steamroller moves on.
An interesting tale of censorship at Google is told and documented by Seth Finkelstein. Basically, Google removed a page from its index in Feb. 2003 after pressure from the UK. The page seems to now be gone from the Web itself, but according to various reports, it was a very sick, twisted joke page, and not the pedophile page it was claimed to be.
AlltheWeb has a brand new look. There are both cosmetic and content changes. First, the cosmetic changes:
- Banner ads are gone
- More readable results pages
- Advanced search has a new Boolean box
- New color palette
- "Streamline user interface"
- Four font size change icons are gone
- New slogan: Find it all
Even more exciting is their somewhat hidden new search feature, AlltheWeb URL Investigator. This is invoked if you search for a URL. Enter a URL, and the results page will include some if not all of the following information about that URL. See an example for loc.gov.
- FAST Facts including page language, size, and last update date
- The record and a link for the page itself
- The number of pages that link to the URL
- The number of pages that contain the term
- The number of pages at the site
- An Easywhois link to domain ownership information
- A link to the Wayback Machine copies of how the page used to look
- Subdomains at the site, if any
- Open Directory category the page is in, if any
A few other changes to note:
- There is now an easier switch for the offensive content filter. Just click On or Off in the upper right-hand corner to make the change.
- Also, the FAST Topics option is gone from the preferences settings. After months of waiting for FAST Topics to reappear, it looks like they have given up on that initiative.
- The count on the home page went from 2,112,188,990 yesterday to 2,147,483,647 today.
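A quick arithmetic check on that last figure: 2,147,483,647 is exactly 2^31 - 1, the largest value a signed 32-bit integer can hold. The snippet below only confirms the arithmetic; whether the home page counter actually hit such a limit is a guess on my part and not something AlltheWeb has stated.

```python
# Largest value representable in a signed 32-bit integer.
INT32_MAX = 2**31 - 1

# Counts shown on the AlltheWeb home page, as quoted above.
yesterday = 2_112_188_990
today = 2_147_483_647

# How much the displayed count grew overnight.
increase = today - yesterday

print(f"today == INT32_MAX: {today == INT32_MAX}")
print(f"overnight increase: {increase:,}")
```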
As of Feb. 28, 2003, Northern Light Current News has stopped being updated at both northernlight.com and nlresearch.com. I suppose it is not too surprising, since divine, Inc. (owner of Northern Light) filed for bankruptcy last week. Northern Light Current News search was a great resource because rather than searching and crawling news Web sites, it had access to actual wire news feeds. Oh well, another great resource appears to be headed for the dust bin.