March 2005 Archive
According to Urchin, Google Agrees To Acquire Urchin, an analytics tool used to understand users' experiences, optimize content, and track marketing performance. From the press release:
"We want to provide web site owners and marketers with the information they need to optimize their users' experience and generate a higher return-on-investment from their advertising spending," said Jonathan Rosenberg, vice president of product management, Google. "This technology will be a valuable addition to Google's suite of advertising and publishing products."
While Google still does not release a list of its sources for Google News (apparently secrecy is "not evil"), an interesting hack is available from PrivateRadio.org that runs a PHP script every 15 minutes and records the sources on the Google News home page. Started March 24, 2005, by today it lists over 1,000 sources which can be sorted alphabetically or by frequency of inclusion on the Google News home page.
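The PrivateRadio.org script itself is PHP and its code is not shown here, so the following is only a rough Python sketch of the tallying idea. It assumes each run yields a list of source names from the home page and that a source counts once per run; the function names and the sample source names are made up for illustration.

```python
from collections import Counter


def tally_sources(snapshots):
    """Count how many snapshots each source appears in.

    `snapshots` is a list of lists of source names, one inner list per
    run of the (hypothetical) 15-minute scraper.
    """
    counts = Counter()
    for snapshot in snapshots:
        # Assumption: a source counts once per snapshot, however many
        # stories it has on the page at that moment.
        counts.update(set(snapshot))
    return counts


def sorted_alphabetically(counts):
    """Return (source, count) pairs in case-insensitive A-Z order."""
    return sorted(counts.items(), key=lambda item: item[0].lower())


def sorted_by_frequency(counts):
    """Return (source, count) pairs, most frequent first, ties A-Z."""
    return sorted(counts.items(), key=lambda item: (-item[1], item[0].lower()))


if __name__ == "__main__":
    # Made-up source names, not real Google News data.
    runs = [["Beta Wire", "Alpha News", "Beta Wire"], ["Beta Wire"]]
    totals = tally_sources(runs)
    print(sorted_alphabetically(totals))
    print(sorted_by_frequency(totals))
```

Counting once per run measures how often a source makes the page at all; counting every story instead would favor sources with several stories on the page at once.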
Yahoo! has released its own Creative Commons Search, which looks for Web content that carries a Creative Commons license. This lets a user search for content (text, pictures, music, video, etc.) that the creator allows others to use without asking for royalties. The Creative Commons offers several different types of licenses. The new Yahoo! search has limits for "content I can use for commercial purposes" and "content I can modify, adapt, or build upon." While the Creative Commons site has had its own search engine for a while (and it has been available in Firefox's search box), Yahoo!'s version covers many more sites and finds significantly more content. The Creative Commons search engine now also has a search box for the Yahoo! Creative Commons Search alongside its own version. This promises to be a very useful search engine. I just hope Yahoo! adds a file type limit like the one the Creative Commons search has for audio, image, interactive, text, or video.
Metasearch engine Ixquick has launched a redesigned site. Their press release [pdf] mentions a re-engineered algorithm, an updated user interface, an international phone directory, and results honing, which lets users remove certain results from the list. Ixquick does not include results from Google but does include Yahoo!, MSN, Ask Jeeves, Gigablast, and WiseNut.
The latest version of the Yahoo! Desktop Search has added the indexing of Yahoo! Messenger messages and the contents of your Yahoo! Address Book.
Gigablast announces the addition of related pages results on searches. These appear in a yellow box above the regular results and are supposed to include "highly relevant search results which do not necessarily contain the searchers query terms." The list of related pages can be expanded by clicking on the "more" link in the yellow box. While the initial suggestions for several searches I tried all seem to contain my search terms, expanding the list found others that did not. The exact method used to find these related pages is not specified, but it provides another way to search laterally and to expand a search beyond what other search engines may provide.
A couple of acquisitions involving search companies. First, Ask Jeeves announces that they have signed an agreement to be acquired by Barry Diller's IAC/InterActiveCorp. Second, Yahoo! has signed an agreement to acquire Flickr, the photo sharing site. See more in the Flickr blog post or the Yahoo! blog post.
LookSmart announces a redesign of its FindArticles site. With over 5 million articles from over 1,000 publications, FindArticles now has links to FURL for saving articles and also has the ability to limit a search to just the free articles available. For those with no access to a larger article database, FindArticles can be useful. But it includes fewer free articles than it used to, and many of the publications cover only limited date spans. Others are not very current (with Omni for example only going up to 1995). Check for access from a local library to the larger InfoTrac, EbscoHost, ProQuest, and other databases that have much more up-to-date full text article access.
For a long time, Feedster has had detailed documentation about its advanced search commands but has noted that not all of the commands were enabled. Now, their help page for advanced searching has been completely revised and should only include features that work. Many notable advanced features are available, including:
- Proximity with the ability to specify how close
- Full Boolean
- Four wildcard characters
- Number range searching
- Relevance weighting
- Date and time limits
- Case sensitive searching
Google announced the launch of Google X, an interface that looks like Mac OS X: icons above the search box grow larger and name the service when moused over. This does make it easy to have more links above the search box. Unfortunately, Google removed it soon after launching it. No official word, but many presume it was removed due to copyright concerns or complaints from Apple. Anyway, for a while at least, there is an unofficial mirror in France for the curious.
News out today about Yahoo! 360 which is due to be available in beta later this month. The site explains that 360 is designed to be used to blog, share photos and address books, post reviews, and do several other social networking activities. It is not yet available, but interested people can sign up to be on the beta waiting list.
Google has opened up a new site, Code.google.com, on which they are providing access to developer-oriented programming libraries and tools. This is intended as a site for "external developers interested in Google-related development." They plan to publish free source code and a listing of their API services. This will be of interest primarily to programmers or those who might play with the APIs. The initial projects include a Core Dumper, a Sparse Hashtable, and Perftools.
With the slogan "We want OpenSearch to do for search what RSS has done for content," A9 has launched its OpenSearch initiative. With OpenSearch, anyone can create their own search button for A9's options on the right. Some examples already listed on their Add More Columns page include searches for the New York Times, Flickr photos, Wikipedia, ThinkGeek, PubMed, and A9 Top Blogs.
Google still can't do it, but now Ask Jeeves joins Yahoo! in announcing a search toolbar that works in Mozilla Firefox. Once downloaded, the toolbar lets a user search Ask Jeeves, save any page browsed to MyJeeves, save specific locations, and snip content from Web pages.
Clusty, Vivisimo's new clustering meta search engine, has added a new government search available both as a tab and a direct search site. It includes FirstGov, various think tanks, U.S. political news, and some subject-specific collections.
I don't know how recently this changed, but Yahoo! used to not search the stop words within a phrase search. In other words, searching for "difference in principle" would find matches with "difference of principle." Now the stop words are searched. While this is a good thing for phrase searching, it breaks several neat tricks you could do with Yahoo! In particular, the Yahoo! hack to get it to search for a wildcard word within a phrase search no longer works. This also breaks Tara's YNAPS -- Yahoo Non-API Proximity Search.
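To make the difference concrete, here is a small Python sketch of the two phrase-matching behaviors. It assumes that an ignored stop word acts as a placeholder for any single word, which fits the example above but is my guess rather than documented Yahoo! behavior. The stop word list and the function name are illustrative only.

```python
# Illustrative stop word list; the real list Yahoo! used is not known here.
STOP_WORDS = {"a", "an", "in", "of", "the", "to"}


def phrase_matches(query, text, search_stop_words):
    """Return True if `query` matches a run of consecutive words in `text`.

    When `search_stop_words` is False, a stop word in the query matches
    any single word at that position (the assumed old behavior). When it
    is True, every query word must match exactly (the new behavior).
    Punctuation is not handled; words are split on whitespace only.
    """
    q = query.lower().split()
    words = text.lower().split()
    for start in range(len(words) - len(q) + 1):
        window = words[start:start + len(q)]
        if all(
            qw == tw or (not search_stop_words and qw in STOP_WORDS)
            for qw, tw in zip(q, window)
        ):
            return True
    return False


if __name__ == "__main__":
    page = "a difference of principle"
    print(phrase_matches("difference in principle", page, False))
    print(phrase_matches("difference in principle", page, True))
```

Under the old behavior a stop word in the middle of a phrase worked as a one-word wildcard, which is what the wildcard hack relied on.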
Google went through this same process a few years ago. However, they began allowing the use of an asterisk * to be a wildcard word in a phrase when they started searching stop words in a phrase. I would love to see Yahoo! do the same (and for that matter allow the * to function as a regular truncation symbol)! Until that happens, for proximity searching, we now only have Exalead with its NEAR operator and the unofficial GAPS. Yahoo! Search review updated.
With MSN's new database and search engine, I've finally updated the MSN Search review to reflect the changes. I also left up the old review of MSN Search when it was using the Inktomi database from Yahoo! for the sake of comparison.
I posted a new showdown based on how well the search engines handle very long words. The long word showdown found that Gigablast ranked best for long word searching since it could handle a query with a word 1,896 characters long. Google can't handle a 155 character query that both MSN and Yahoo! can find. Who will ever search a word that long and who cares? Probably only search geeks, but hey, I was curious about it.
In its continuing move towards a portal, Google now lets users customize some aspects of the Google News front page. Users can re-arrange sections and even add customized sections with up to 9 stories based on a particular query. More information is available in the Google News: Customized News FAQ. This is available in the 9 languages and 22 local editions of Google News. While these changes certainly make Google News more useful as a starting point for news, it could use an option to reduce the size of each listing. There is a "show headlines only" option which removes the first sentence and images. But it is the extra links to other sources for the same or similar story that take up too much screen space. And as Chris notes, there is no RSS feed option.
Yahoo! has made a fairly major change to its Directory. The Yahoo! Directory was the original Yahoo! product, but it has been greatly de-emphasized in recent years. Now, the front page has further reduced the category listings to the left margin and has put featured sites front and center. In addition, as Danny notes, the entries within categories are ranked in some sort of "popularity" order. Fortunately, an alphabetical listing is still available as an option.
Not only is there now an updated version of the Google Desktop Search client, but it is no longer in beta. It only runs on Windows XP or Windows 2000 SP 3 or above. But the new version indexes more content (but still not everything). New content types include Netscape Mail, Thunderbird Mail, Netscape/Firefox/Mozilla Web browsing, PDFs, and any meta tags associated with music, image, and video files.
A recent article "Google's Cookie and Hacking Google Print" describes techniques used to write a script that can create PDFs of entire copyrighted books from Google Print. [More comments on it at Kuro5hin.] The full code is not available and the author let Google know about the issue, but the point is that despite some clever programming on Google's part, there can be numerous ways of getting around the copyright restrictions once a book is in a publicly-accessible electronic format. Not a problem for the out of copyright books, but for those still under copyright . . .
There are problems reported with Google's Wildcard Word in a Phrase. The problem is that the asterisk seems to represent either zero or one word. It used to represent exactly one word. For example,
"a little * * * mischief" used to find only "a little neglect may breed mischief" or a similar phrase of six words. Now it also finds pages with just "a little mischief." The cache copy on those pages says that the search terms only appear in pages pointing to the resulting page, but that does not seem accurate. I think that what now happens is that in addition to the way it used to work, Google now also ORs the results of the same search as if the asterisks were not in the query.
Continuing with its trend to add more portal features, Google now has quick access to current weather conditions and a four day forecast for U.S. cities and ZIP codes. Strangely enough, the weather information does not link to a source for more detailed weather information for the locality, even though the Weather Underground (from which Google gets its weather information) does have more detailed conditions and forecasts. This is also available via SMS by sending a text message to the U.S. five digit shortcode 46645 (GOOGL on most mobile phones) followed by the weather query. As Gary notes, Google is finally catching up with a feature that AltaVista offered back in 2002.
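For anyone wondering how 46645 spells GOOGL, it is just the standard phone keypad letter groups. A tiny Python sketch (the function name is made up for illustration):

```python
# Standard phone keypad letter groups.
KEYPAD = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}

# Invert the table so each letter maps to its digit.
LETTER_TO_DIGIT = {
    letter: digit for digit, letters in KEYPAD.items() for letter in letters
}


def to_shortcode(word):
    """Convert a word made of letters A-Z to its keypad digits."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word.upper())


if __name__ == "__main__":
    print(to_shortcode("GOOGL"))  # 46645
```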
Ask Jeeves has "simplified" its picture search results. Now, it shows only a thumbnail and a link below it labeled "source." No file name, URL, size, or dimension information is available like that offered at both Yahoo! and Google. Even MSN offers most of that (except the file name). Picsearch, which supplies the image database to Ask, at least offers dimension and file size. While some will like the cleaner look, I prefer the services that offer more information about the image on the search results page itself.
Happy Birthday! Ten years ago, Yahoo! was incorporated. To celebrate, they have a letter from their founders along with a ten year retrospective called Yahoo! Netrospective: 10 years, 100 moments of the Web.
Lycos has dropped its Yahoo! search engine (formerly known as Inktomi) in favor of an Ask Jeeves database (also known as Teoma). The Ask Jeeves press release includes the following: "The Lycos brand is known for search and we're committed to re-establishing Lycos.com as a leading search site," said Adam Soroca, general manager of search services at Lycos, Inc. "Ask Jeeves' Teoma search technology will deliver outstanding results to users of our new search-centric experience at Lycos.com."
Yahoo! Search now offers access to its application programming interfaces (APIs) via the Yahoo Search Developer Network. These are tools for programmers and scripting folks. The site notes that APIs can use data from the Web, Image, Local, News, and Video databases.
Yahoo! has announced that Overture, their text ad division, will be renamed Yahoo! Search Marketing Solutions. Originally known as GoTo, the company became Overture in 2001. Yahoo! then bought Overture in 2003 and now starts the rebranding drive. The official U.S. name change is scheduled for the next few months with the international changes coming later.