Review of Yahoo! Search
Mar. 31, 2008.
by Greg R. Notess
Yahoo! is one of the best known and most popular Internet portals. Originally a subject directory of sites, it is now a search engine, directory, and portal. To go to the Yahoo! portal and main starting point, use www.yahoo.com. For direct access to the search engine, use search.yahoo.com, and for the directory use dir.yahoo.com. This review primarily covers the search engine features.
Databases: On a search, a few links to internal Yahoo! content may be shown along with sponsored links (ads), and then the bulk of results come from Yahoo!'s own search engine database, introduced in Feb. 2004. Before that point, Yahoo! used search engine databases from other vendors. Yahoo! has many databases, including:
- Web Pages
- Sponsor Results
- Yahoo! directory
- Local (Yellow Pages and Maps)
- Other databases provide much of the information from the portal side of Yahoo!
Strengths:
* A large, unique search engine database
* Includes cached copies of pages
* Includes links to the Yahoo! directory
* Supports full Boolean searching
* Wild Card Word in Phrase
Weaknesses:
* Lack of some advanced search features such as truncation
* Only indexes first 500 KB of a Web page
* Link searches require the inclusion of the http://
* File type search uses originurlextension: rather than
* Includes some pay for inclusion sites
Default Operation: Multiple search terms are processed as an AND operation by default. So adding more terms should get fewer results.
Boolean Searching: Yahoo! supports Boolean operators and nested searching with the operators AND, OR, and NOT. Either AND NOT or NOT can be used. Searching can be nested using parentheses. Operators must be in upper case. Yahoo! can also use - for NOT but only when it is not used along with the Boolean operators. In the Advanced Search, it also has forms for "all of these words," "any of these words," and "none of these words."
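As a rough illustration of the syntax described above, the following Python sketch assembles a query string from the same three groups the Advanced Search form offers. The function name and its parameters are hypothetical, chosen only for this example; it builds text to paste into the search box and does not contact Yahoo!.

```python
def boolean_query(all_of=(), any_of=(), none_of=()):
    """Compose a query using upper-case AND, OR, and NOT with parentheses.

    Hypothetical helper: mirrors the "all of these words", "any of these
    words", and "none of these words" boxes on the Advanced Search form.
    """
    parts = []
    if all_of:
        parts.append(" AND ".join(all_of))
    if any_of:
        # Nest the OR group in parentheses so it is evaluated as one unit.
        parts.append("(" + " OR ".join(any_of) + ")")
    query = " AND ".join(parts)
    for term in none_of:
        # The review notes that either NOT or AND NOT is accepted.
        query += " NOT " + term
    return query.strip()


print(boolean_query(all_of=["search", "engine"],
                    any_of=["review", "comparison"],
                    none_of=["google"]))
# → search AND engine AND (review OR comparison) NOT google
```

The operators are written in upper case because, as noted above, Yahoo! requires that. The sketch uses NOT rather than the - shortcut, since - cannot be mixed with the Boolean operators.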
Proximity Searching: Phrase searching is available by using "double quotes" around a phrase. While no other official proximity operator is available, the Wild Card Word in Phrase technique (see Truncation, below) can be combined with ORs to rather tediously create a proximity search.
The following worked until 3/13/05: [While not officially supported by Yahoo!, Tara Calishain has created YNAPS -- Yahoo Non-API Proximity Search which combines the wild card word within a phrase capability (see below) with the OR operator to get up to five word proximity searches.]
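To show how tedious the wildcard-plus-OR approach gets by hand, here is a minimal Python sketch that generates such a query. It is not the YNAPS tool itself, only an illustration of the idea described above; the function name and parameters are assumptions made for this example.

```python
def proximity_query(first, second, max_gap=2, either_order=True):
    """OR together phrase searches with 0..max_gap wildcard words between two terms.

    Hypothetical helper: each * stands for exactly one word inside a phrase,
    so one phrase is needed for every possible distance.
    """
    pairs = [(first, second)]
    if either_order:
        pairs.append((second, first))
    phrases = []
    for left, right in pairs:
        for gap in range(max_gap + 1):
            words = [left] + ["*"] * gap + [right]
            phrases.append('"' + " ".join(words) + '"')
    return " OR ".join(phrases)


print(proximity_query("addictive", "vice", max_gap=1, either_order=False))
# → "addictive vice" OR "addictive * vice"
```

With either_order left on and max_gap set to 4, the same call produces ten phrases, which gives a sense of why the manual version is tedious.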
Truncation: No truncation is available nor is there any automatic plural searching or word stemming.
Yahoo! can search a Wildcard Word in a Phrase. Use an asterisk * within a phrase search to match any word in that position. So, for example, to find "addictive semiconscious vice of biblioscopy" when you are not sure of the third word, search "addictive semiconscious * of biblioscopy". Multiple wildcard words can be used, as in "addictive * * of biblioscopy". This is the only way Yahoo! supports a
The following worked until 3/13/05: [Using stem: before a term would run an English-language stemming algorithm on that particular word. For example, stem:flood matched on 'flood' as well as 'flooded.']
Case Sensitivity: Yahoo! has no case sensitive searching. Using either lower or upper or mixed case will result in the same hits.
Field Searching: Yahoo! Search supports several field searches. Use
the syntax below, or in the Advanced Search, use the
drop down choices for title and URL. The title word search can also be
entered in the query box by using the intitle:word syntax. The first six
field searches are listed in the documentation, but the others appear to
work and come from old Inktomi commands.
| Field prefix | What it finds |
| --- | --- |
| intitle: or title: | Hits have the term in the HTML title element. Example: intitle:showdown |
| site: or domain: | Limits to a particular domain or subdomain. Example: site:notess.com |
| hostname: | Similar to site: Limits to a specific host. Example: hostname:dept.stateu.edu |
| link: | Finds pages that link to a specific URL (which must include the http://). Example: link:http://dir.yahoo.com |
| linkdomain: | Limits to pages containing links to the specified domain (do not include the http://). Example: linkdomain:notess.com |
| url: | Finds exactly one specific URL in the database. Example: url:http://searchengineshowdown.com/features/ |
| inurl: | Finds parts (words) within indexed URLs. Example: inurl:features |
| originurlextension: | File type limit. Use file extension after colon. Example: originurlextension:pdf |
| feature:acrobat | Page links to Adobe Acrobat PDF files |
| feature:applet | Page has embedded Java applets |
| feature:activex | Page has ActiveX controls or layouts |
| feature:audio | Page links to several audio formats |
| feature:flash | Page links to or has Flash files |
| feature:form | Page uses forms |
| feature:frame | Page uses frames |
| feature:homepage | Finds personal pages which use a tilde ~ before their directory |
| feature:image | Page includes gif, jpg, and other image files |
| feature:index | Home pages only |
| feature:meta | Page includes meta tags |
| feature:script | Page has embedded scripts |
| feature:shockwave | Page links to or has embedded Shockwave files |
| feature:table | Page uses tables |
| feature:video | Page links to or has embedded video files |
| feature:vrml | Page links to VRML files |
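The field prefixes are simply typed alongside ordinary search terms. The Python sketch below shows one way to assemble such a query and to respect the rule that link: needs the http://. Both function names are hypothetical, and accepting https:// URLs unchanged is an assumption of this example rather than something the review tested.

```python
def field_query(terms, **fields):
    """Join plain terms with prefix:value restrictions such as site: or intitle:.

    Hypothetical helper: keyword names are used as the field prefixes.
    """
    parts = list(terms)
    for prefix, value in fields.items():
        parts.append(f"{prefix}:{value}")
    return " ".join(parts)


def link_term(url):
    """Build a link: term, adding http:// when the scheme is missing."""
    # Assumption: URLs that already start with https:// are left as they are.
    if not url.startswith(("http://", "https://")):
        url = "http://" + url
    return "link:" + url


print(field_query(["search", "engines"], site="notess.com", intitle="showdown"))
# → search engines site:notess.com intitle:showdown
print(link_term("dir.yahoo.com"))
# → link:http://dir.yahoo.com
```

Note that linkdomain: works the opposite way from link: and should be given the bare domain, so it can go through field_query without the scheme.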
Limits: Yahoo! has limits for language, domain, date, file type, country, and adult content. The date limit is available on the Advanced Search page. Only three options are available: Past 3 Months, Past 6 Months, or Past Year.
The file type limit is also available on the Advanced Search page. It offers file type limits under the label of File Format for HTML, PDF, Excel (.xls), PowerPoint (.ppt), Word (.doc), RSS/XML (.xml), and Text Format (.txt). The file type limit can also be used on the command line with the originurlextension: prefix followed by the extension. Using the filetype: prefix, the file type limit can also be used for PostScript (.ps), WordPerfect (.wpd), and probably other file extensions. To use the prefix command, just put the extension immediately after originurlextension: as in originurlextension:pdf.
Yahoo! will sometimes offer HTML versions of PDFs and some other file types.
The language limit is available on the Advanced Search page and includes the following 36 languages as of March 2005.
- Chinese (Simplified & Traditional)
Multiple languages can be chosen.
A SafeSearch filter tries to exclude adult Web pages. It can also be turned on from the preferences page.
The domain and country limits on the advanced search pages can be expanded to groups by adding the following codes to other search terms:
The Advanced Search page offers a site/domain limit, which can be used to limit results to those from the specified domain.
Stop Words: Some common words are supposedly ignored and can be searched with a + in front, but in practice it seems that all words are now searched and there are no stop words in phrase searches. A search on to be or not to be works without phrase markings. Before March 2005, the one exception was in phrase searches where common words such as 'of,' 'the,' 'a,' 'in,' '2,' and 'www' acted only as placeholders and could represent any word.
Sorting: Results are sorted by a relevance algorithm.
Pages are also clustered by site. Only one page per site will be displayed. Others are available via the More pages from this site link after the Cached link at the end of the record. If the search finds fewer than 1,000 results when clustered and you page forward to the last page, the following message will appear after the last record:
In order to show you the most relevant results, we have omitted some entries very similar to the ones already displayed. If you like, you can repeat the search with the omitted results included.
Clicking the "repeat the search" option will bring up more pages, some of which are near or exact duplicates of pages already found while others are pages that were clustered under a site listing. However, clicking on that link will not necessarily retrieve all results that have been clustered under a site. You can also just add &dups=1 to the end of a search results URL. To see all results available, you need to check under each site cluster as well as using the "repeat the search" option.
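For anyone who wants to script the &dups=1 shortcut, a minimal Python sketch follows. The function name is hypothetical, and the sample results URL (including its p= query parameter) is an assumption used only for illustration; substitute the URL copied from an actual results page.

```python
def with_duplicates(results_url):
    """Append dups=1 to a search results URL unless it is already present."""
    if "dups=1" in results_url:
        return results_url
    # Use & when the URL already has a query string, otherwise start one with ?.
    separator = "&" if "?" in results_url else "?"
    return results_url + separator + "dups=1"


# Hypothetical results URL; the real one comes from the browser address bar.
url = "http://search.yahoo.com/search?p=biblioscopy"
print(with_duplicates(url))
# → http://search.yahoo.com/search?p=biblioscopy&dups=1
```

This only repeats the search with the omitted near duplicates included. Pages hidden under a site cluster still need to be checked through each More pages from this site link.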
Display: Yahoo! provides results in seven categories, shown in the graphic reproduced below. The first listed results under Web are from the search engine with the page title, a keyword in context extract (or directory description or meta description), the URL, file size, cache link, and possibly a More pages from this site link. The second and third items link to the image and video databases. The Yahoo! directory results are available under the Directory heading. The Local link goes to local and yellow page results. The News link goes to the Yahoo! News database, while the Products tab, introduced in Fall 2003, goes to the Yahoo! Shopping search. This breakdown used to use tab images, but that changed sometime between April 2004 and Feb. 2005.