Search Engine Showdown
 
 
Yahoo!

Review of Yahoo! Search

Last updated Mar. 31, 2008.
by Greg R. Notess

Yahoo! is one of the best known and most popular Internet portals. Originally a subject directory of sites, it is now a search engine, directory, and portal. To go to the Yahoo! portal and main starting point, use www.yahoo.com. For direct access to the search engine, use search.yahoo.com, and for the directory use dir.yahoo.com. This review primarily covers the search engine features.


Databases: On a search, a few links to internal Yahoo! content may be shown along with sponsored links (ads), and then the bulk of results come from Yahoo!'s own search engine database, introduced in Feb. 2004. Before that point, Yahoo! used search engine databases from other vendors. Yahoo! has many databases, including:

Strengths:
  * A large, unique search engine database
  * Includes cached copies of pages
  * Includes links to the Yahoo! directory
  * Supports full Boolean searching
  * Supports a wildcard word within a phrase

Weaknesses:
  * Lack of some advanced search features such as truncation
  * Only indexes first 500 KB of a Web page
  * Link searches require the inclusion of the http://
  * File type search uses originurlextension: rather than filetype:
  * Includes some pay for inclusion sites

Default Operation: Multiple search terms are processed as an AND operation by default. So adding more terms should get fewer results.
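The effect of the default AND can be illustrated with a toy inverted index. This is only a sketch with made-up data; the index contents and the and_search name are assumptions for illustration and say nothing about how Yahoo! actually stores or processes queries.

```python
# Toy inverted index: term -> set of document ids (made-up data for illustration).
INDEX = {
    "yahoo": {1, 2, 3, 4},
    "search": {2, 3, 4, 5},
    "review": {3, 4, 6},
}


def and_search(terms):
    """Return ids of documents that contain every term (implicit AND)."""
    postings = [INDEX.get(term.lower(), set()) for term in terms]
    if not postings:
        return set()
    return set.intersection(*postings)


print(sorted(and_search(["yahoo"])))                      # [1, 2, 3, 4]
print(sorted(and_search(["yahoo", "search"])))            # [2, 3, 4]
print(sorted(and_search(["yahoo", "search", "review"])))  # [3, 4]
```

Each added term can only shrink (or keep) the result set, which is why more terms should return fewer hits.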

Boolean Searching: Yahoo! supports Boolean operators and nested searching with the operators AND, OR, and NOT. Either AND NOT or NOT can be used. Searching can be nested using parentheses. Operators must be in upper case. Yahoo! can also use - for NOT but only when it is not used along with the Boolean operators. In the Advanced Search, it also has forms for "all of these words," "any of these words," and "none of these words."
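The rules above (upper-case operators, no mixing of - with the Boolean operators, nesting with parentheses) can be checked before a query is submitted. The following is a rough sketch; check_query is a hypothetical helper, and its tokenizing is a simplification rather than Yahoo!'s own query parser.

```python
import re

BOOLEAN_OPERATORS = {"AND", "OR", "NOT"}


def check_query(query):
    """Return a list of warnings for a query, based on the rules described above."""
    warnings = []
    # Split into quoted phrases, single parentheses, and other tokens.
    tokens = re.findall(r'"[^"]*"|[()]|[^\s()]+', query)
    has_operator = any(t in BOOLEAN_OPERATORS for t in tokens)
    has_minus = any(t.startswith("-") and len(t) > 1 for t in tokens)
    for t in tokens:
        if t.upper() in BOOLEAN_OPERATORS and t not in BOOLEAN_OPERATORS:
            warnings.append(f"'{t}' is not upper case, so it will not act as an operator")
    if has_operator and has_minus:
        warnings.append("'-' is mixed with Boolean operators; use NOT instead")
    if tokens.count("(") != tokens.count(")"):
        warnings.append("unbalanced parentheses")
    return warnings


print(check_query("(yahoo OR google) AND NOT msn"))  # []
print(check_query("yahoo and google"))
print(check_query("yahoo AND google -msn"))
```

The first query passes cleanly; the second is flagged for the lower-case operator and the third for combining - with AND.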

Proximity Searching: Phrase searching is available by using "double quotes" around a phrase. While no other official proximity is available, the Wild Card Word in Phrase technique (see Truncation, below) can be combined with ORs to rather tediously create a proximity search.
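The tedious part of that technique, writing one phrase per gap size and joining them with OR, is easy to generate. This is a minimal sketch; proximity_query and its parameters are hypothetical names, and it assumes the wildcard-in-phrase behavior described in this review.

```python
def proximity_query(first, second, max_gap=3, ordered=True):
    """Build an OR of phrase searches with 0..max_gap wildcard words between two terms."""
    phrases = []
    for gap in range(max_gap + 1):
        wildcards = ["*"] * gap
        phrases.append('"' + " ".join([first] + wildcards + [second]) + '"')
        if not ordered:
            # Also allow the terms in the opposite order.
            phrases.append('"' + " ".join([second] + wildcards + [first]) + '"')
    return " OR ".join(phrases)


print(proximity_query("search", "showdown", max_gap=2))
# "search showdown" OR "search * showdown" OR "search * * showdown"
```

Setting ordered=False doubles the number of phrases, so long gaps produce long queries quickly.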

The following worked until 3/13/05: [While not officially supported by Yahoo!, Tara Calishain has created YNAPS -- Yahoo Non-API Proximity Search which combines the wild card word within a phrase capability (see below) with the OR operator to get up to five word proximity searches.]

Truncation: No truncation is available nor is there any automatic plural searching or word stemming.

Yahoo! can search a Wildcard Word in a Phrase. Use an asterisk * within a phrase search to match any word in that position. So, for example, to find "addictive semiconscious vice of biblioscopy" when you are not sure of the third word, search "addictive semiconscious * of biblioscopy". Multiple wildcards can be used, as in "addictive * * of biblioscopy". This is the only way Yahoo! supports a wildcard symbol.
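To make the "one asterisk, one word" behavior concrete, the sketch below emulates it locally with a regular expression. It illustrates the matching rule as described here and is not Yahoo!'s implementation; phrase_to_regex is a hypothetical helper.

```python
import re


def phrase_to_regex(phrase):
    """Turn a phrase with * placeholders into a regex where each * matches one word."""
    parts = [r"\w+" if word == "*" else re.escape(word) for word in phrase.split()]
    return re.compile(r"\b" + r"\W+".join(parts) + r"\b", re.IGNORECASE)


text = "the addictive semiconscious vice of biblioscopy"
print(bool(phrase_to_regex("addictive semiconscious * of biblioscopy").search(text)))  # True
print(bool(phrase_to_regex("addictive * * of biblioscopy").search(text)))              # True
print(bool(phrase_to_regex("addictive * of biblioscopy").search(text)))                # False
```

The last pattern fails because a single asterisk stands for exactly one word, while two words separate "addictive" from "of" in the text.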

The following worked until 3/13/05: [Using stem: before a term would run an English-language stemming algorithm on that particular word. For example, stem:flood matched on 'flood' as well as 'flooded.'] 

Case Sensitivity: Yahoo! has no case sensitive searching. Using either lower or upper or mixed case will result in the same hits.

Field Searching: Yahoo! Search supports several field searches. Use the syntax below, or in the Advanced Search, use the drop down choices for title and URL. The title word search can also be entered in the query box by using the intitle:word syntax. The first six field searches are listed in the documentation, but the others appear to work and come from old Inktomi commands.

Field Explanation
intitle: or title: Hits have the term in the HTML title element. Example: intitle:showdown
site: or domain: Limits to a particular domain or subdomain. Example: site:notess.com
hostname: Similar to site:, limits to a specific host. Example: hostname:dept.stateu.edu
link: Finds pages that link to a specific URL (which must include the http://). Example: link:http://dir.yahoo.com
linkdomain: Limits to pages containing links to the specified domain (do not include the http://). Example: linkdomain:notess.com
url: Finds exactly one specific URL in the database. Example: url:http://searchengineshowdown.com/features/
inurl: Finds parts (words) within indexed URLs. Example: inurl:features
originurlextension: File type limit. Use file extension after colon. Example: originurlextension:pdf
feature:acrobat Page links to Adobe Acrobat PDF files
feature:applet Page has embedded Java applets
feature:activex Page has ActiveX controls or layouts
feature:audio Page links to several audio formats
feature:flash Page links to or has Flash files
feature:form Page uses forms
feature:frame Page uses frames
feature:homepage Finds personal pages which use a tilde ~ before their directory
feature:image Page includes gif, jpg, and other image files
feature:index Home pages only
feature:javascript Page uses JavaScript
feature:meta Page includes meta tags
feature:script Page has embedded scripts
feature:shockwave Page links to or has embedded shockwave files
feature:table Page uses tables
feature:video Page links to or has embedded video files
feature:vrml Page links to VRML files
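Because link: needs the http:// while linkdomain: must omit it, a small helper can keep field searches consistent. This is only a sketch; field_query is a hypothetical name, and it encodes just the two http:// rules listed in the table above.

```python
def field_query(field, value):
    """Format a field search such as site:notess.com, applying the http:// rules above."""
    has_scheme = value.startswith("http://")
    if field == "link" and not has_scheme:
        value = "http://" + value           # link: requires the http://
    elif field == "linkdomain" and has_scheme:
        value = value[len("http://"):]      # linkdomain: must not include it
    return f"{field}:{value}"


print(field_query("link", "dir.yahoo.com"))            # link:http://dir.yahoo.com
print(field_query("linkdomain", "http://notess.com"))  # linkdomain:notess.com
print(" ".join([field_query("intitle", "showdown"),
                field_query("originurlextension", "pdf")]))
# intitle:showdown originurlextension:pdf
```

Joining several field searches with spaces relies on the default AND operation.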

Limits: Yahoo! has limits for language, domain, date, file type, country, and adult content. The date limit is available on the Advanced Search page, with only three options: Past 3 Months, Past 6 Months, or Past Year.

The file type limit is also available on the Advanced Search page. It offers file type limits under the label of File Format for HTML, PDF, Excel (.xls), PowerPoint (.ppt), Word (.doc), RSS/XML (.xml), and Text Format (.txt). The file type limit can also be used on the command line with the originurlextension: prefix followed by the extension. The same prefix also works for PostScript (.ps), WordPerfect (.wpd), and probably other file extensions. To use the prefix command, just put the extension immediately after originurlextension: as in differentials originurlextension:ps. Yahoo! will sometimes offer HTML versions of PDFs and some other file types.

The language limit is available on the Advanced Search page and covered 36 languages as of March 2005.

Multiple languages can be chosen.

A SafeSearch filter tries to exclude adult Web pages. It can also be turned on from the preferences page.

The domain and country limits on the advanced search pages can be expanded to groups by adding the following codes to other search terms:

The Advanced Search page offers a site/domain limit, which can be used to limit results to those from the specified domain.

Stop Words: Some common words are supposedly ignored and can be searched with a + in front, but in practice it seems that all words are now searchable. There are no stop words in phrase searches, and searches on 'the the' and 'to be or not to be' both work without phrase markings. Before March 2005, the one exception was in phrase searches, where common words such as 'of,' 'the,' 'a,' 'in,' '2,' and 'www' acted only as placeholders and could represent any word.

Sorting: Results are sorted by a relevance algorithm. Pages are also clustered by site, so only one page per site will be displayed. Others are available via the "More pages from this site" link after the Cached link at the end of the record. If the search finds fewer than 1,000 results when clustered and you page forward to the last page, the following message will appear after the last record:

In order to show you the most relevant results, we have omitted some entries very similar to the ones already displayed. If you like, you can repeat the search with the omitted results included.

Clicking the "repeat the search" option will bring up more pages, some of which are near or exact duplicates of pages already found while others are pages that were clustered under a site listing. However, clicking on that link will not necessarily retrieve all results that have been clustered under a site. You can also just add &dups=1 to the end of a search results URL. To see all results available, you need to check under each site cluster as well as use the "repeat the search" option.
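Adding the dups=1 parameter can be scripted when many result URLs need it. The sketch below uses only Python's standard library; the example URL (the /search path and the p query parameter) is an assumption for illustration, and with_duplicates is a hypothetical helper.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_duplicates(results_url):
    """Return the results URL with dups=1 set in its query string."""
    parts = urlsplit(results_url)
    # Keep existing parameters, dropping any earlier dups value.
    params = [(key, value)
              for key, value in parse_qsl(parts.query, keep_blank_values=True)
              if key != "dups"]
    params.append(("dups", "1"))
    return urlunsplit(parts._replace(query=urlencode(params)))


# Assumed example URL, not taken from the review.
print(with_duplicates("http://search.yahoo.com/search?p=biblioscopy"))
# http://search.yahoo.com/search?p=biblioscopy&dups=1
```

Parsing the query string rather than appending text avoids ending up with two dups parameters on a URL that already has one.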

Display: Yahoo! provides results in seven categories, shown in the graphic reproduced below. The first listed results under Web are from the search engine with the page title, a keyword in context extract (or directory description or meta description), the URL, file size, cache link, and possibly a "More pages from this site" link. The second and third items link to their image and video databases. The Yahoo! directory results are available under the Directory heading. The Local link goes to local and yellow page results. The News link goes to the Yahoo! News database, while the Products tab, introduced Fall 2003, goes to the Yahoo! Shopping search. This breakdown used to use tab images, but that changed sometime between April 2004 and Feb. 2005.

[Graphic: bar listing the Yahoo! search result categories]

Documentation:
Search Help
Yahoo! Company Information
Press Releases