Please note: This page is an earlier version of our current Review of Image Search Engines, which we have archived for reference purposes only. Links on this page are no longer being checked and may no longer work. Please refer to our current Review of Image Search Engines for the latest information.
This is the third TASI review of image search engines (earlier reviews have been archived to enable comparison: 2003 and 2004). It is eighteen months since the last TASI review, which is a long time in the world of search engines. By the time of that review (October 2004) there had been a shake-up of image search provision among the general search engines, with Yahoo and Google emerging as the major players. There has been less change over the past eighteen months, with Yahoo and Google still vying for top place among the general search engines. The most notable changes have been the recent arrival of Ask’s image search and the growth of Picsearch, which now provides image search services to several of the general search engines. These developments are discussed in detail below (section 2). Readers should note that the information presented in this review is correct for May 2006: if you are reading this some time after that date, you are advised to check the sites carefully and to ask the kinds of questions TASI has included in its evaluation criteria (below).
The phrase ‘Image Search Engine’ is typically used of Web-based services that collect and index images from other sites on the Internet. Image searching is offered by general search engines, like Google or Yahoo, and by specialised image search engines - services devoted to the searching of images or multimedia. In addition, there are meta-search engines, which pass on search requests to more than one search engine and then bring back the results.
Sometimes people use ‘Image Search Engine’ to refer to collection-based search engines - services that index a single or small number of image collections. Large digital libraries, commercial stock photo collections like Corbis, or community-based collections like Flickr typically offer their own search engine-like facilities.
The types of image search engines mentioned above are text-based - their indexes are created from words associated with the images. In addition, there have been attempts to create content-based search engines, which ‘index’ visual characteristics of an image, such as its shape and colour. To date, these attempts are still largely experimental and usually limited to single image collections.
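To make the idea of ‘indexing’ colour concrete, here is a minimal sketch of one well-known content-based technique: a coarse colour histogram compared by histogram intersection. It is an illustration only, not the method of any engine mentioned in this review; the function names and the choice of four bins per channel are assumptions made for the example.

```python
def colour_histogram(pixels, bins_per_channel=4):
    """Return a normalised coarse RGB histogram.

    `pixels` is an iterable of (r, g, b) tuples with values 0-255.
    Hypothetical helper for illustration only.
    """
    counts = [0] * bins_per_channel ** 3
    total = 0
    for r, g, b in pixels:
        # Map each 0-255 channel value to a bin number 0..bins_per_channel-1.
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        counts[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1
        total += 1
    if total == 0:
        return counts
    return [c / total for c in counts]


def histogram_intersection(h1, h2):
    """Similarity between 0 and 1; 1.0 means identical colour distributions."""
    return sum(min(a, b) for a, b in zip(h1, h2))
```

Two images whose pixels fall into the same colour bins score close to 1.0 and images with no colours in common score 0.0, whatever words surround them on the page. A measure this crude knows nothing about shape or subject, which hints at why such approaches remain largely experimental.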
This critical review concentrates on general, specialised, and meta-search engines (section 2), but also covers, briefly, collection-based and content-based engines (section 3 and section 4). It only deals with English-language search engines and collections.
Image search engines are based on existing search engine technology, but they use additional strategies to identify, categorise and rank images.
A search engine’s indexing of images is done automatically, rather than using human indexers, so it must find ways to guess at the image’s content. The particular algorithms a search engine uses are commercially valuable and so not usually disclosed, but they are likely to take into account the image’s filename and any accompanying ‘ALT’ picture tags (these are coded into the HTML page). They might also look for clues from the image’s context - for example, the words or phrases that are close to the image, or the ‘META’ tags found at the top of the HTML coding. The nature of the Web site and its provider are also likely to be taken into account.
Analysis of an image’s text and context can be used to exclude images as well as include them - for example, an image engine will usually consider an image’s context and associated words when it is blocking out adult material.
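The kinds of clues described above can be sketched in a few lines of Python. This is a guess at the general approach, not the algorithm of any engine reviewed here (those are undisclosed): the class and function names, the ten-word ‘nearby text’ window and the simple blocked-word filter are all assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse
import posixpath
import re


def tokenize(text):
    """Lower-case a string and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


class ImageSignalParser(HTMLParser):
    """Collects, for each <img>, the textual clues mentioned in the review."""

    def __init__(self):
        super().__init__()
        self.meta_keywords = []  # words from <meta name="keywords">
        self.images = []         # one dict of clues per <img> tag
        self._recent_text = []   # page text seen so far

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
            self.meta_keywords += tokenize(attrs.get("content") or "")
        elif tag == "img" and attrs.get("src"):
            path = urlparse(attrs["src"]).path
            stem = posixpath.splitext(posixpath.basename(path))[0]
            self.images.append({
                "src": attrs["src"],
                "filename_words": tokenize(stem),
                "alt_words": tokenize(attrs.get("alt") or ""),
                # Assumption: the ten words preceding the image count as 'nearby'.
                "nearby_words": self._recent_text[-10:],
            })

    def handle_data(self, data):
        self._recent_text += tokenize(data)


def index_terms(html, blocked=frozenset()):
    """Map each image URL to its set of index words, skipping blocked images."""
    parser = ImageSignalParser()
    parser.feed(html)
    parser.close()
    index = {}
    for image in parser.images:
        words = set(image["filename_words"] + image["alt_words"]
                    + image["nearby_words"] + parser.meta_keywords)
        if words & set(blocked):
            continue  # exclude the image because of its associated words
        index[image["src"]] = words
    return index
```

Given a page containing `<img src="/pics/eagle_01.jpg" alt="Bald eagle">`, the image would be indexed under ‘eagle’, ‘bald’ and any surrounding words, even though nothing in the picture itself has been examined. A real engine would presumably also weight these sources differently and take the site itself into account.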
In reviewing image search engines, it is helpful to use a set of evaluation criteria and a standard set of search terms. Although the exercise is still subjective, these will offer some basis for comparison. Readers familiar with previous reviews should note that TASI has changed its search terms for this review.
Evaluation Criteria: In reviewing image search engines, TASI asked questions under five headings: Scope, Search Options, Performance, Presentation, and Support.

Search Terms: TASI tested the image search engines using the following terms. (Note that many of these terms have been changed from previous reviews.)

| Category | Terms and phrases | Rationale |
|---|---|---|
| General search terms | Eagle; Leaf | General, relatively unambiguous, terms |
| Specific search terms | Raphus cucullatus [the dodo bird]; Chaucer | Scientific name to test the indexing and an historic figure to test scope and relevancy |
| Ambiguous search terms | Homer; Queen; Icons; West Bank | To test the weighting of the indexes and their handling of phrases |
| Specific images | ball4_cols.gif (an illustration from the TASI Web site: http://www.tasi.ac.uk/advice/creating/image.html); _41488494_blairoz203getty.jpg (an image from the BBC News website: http://news.bbc.co.uk/1/hi/world/asia-pacific/4848278.stm); Les Demoiselles d’Avignon (a painting by Picasso which is still in copyright; an official version of the image can be seen on the Museum of Modern Art Web site: http://www.moma.org/collection/conservation/demoiselles/index.html) | Two unique filenames to test the comprehensiveness and currency of coverage: the TASI image has been online for several years; the BBC image, less than a week. The Picasso image is included to see how a particular in-copyright image is included and presented |
TASI’s tests were carried out during the first week of April, 2006. For each search engine reviewed we searched on all of the terms and tried to answer all of the questions - the results are presented here in a summary form for convenience. We have also given some overall evaluations (poor, disappointing, good etc.). These are obviously subjective judgements, but are based on the more detailed evaluations and on a comparison of all the engines reviewed.
In TASI’s 2003 review we looked at Google, AlltheWeb, AltaVista, and Lycos in this category, suggesting that Google and AlltheWeb provided the best image searching among the general search engines. In our 2004 review, we found there had been a shake-up in provision, with Yahoo acquiring image search technology and offering very strong competition to Google. At the time of that review (October 2004), Yahoo was also supplying image search services to Lycos, AltaVista, and AlltheWeb; Google was supplying AOL Search; and the specialised image search engine Picsearch was supplying Ask Jeeves.
Now, in mid-2006, Google, Yahoo and Picsearch are still the major providers of image search results and have recently been joined by Ask (formerly Ask Jeeves). In addition to running their own search engines, Google now supplies A9 (http://a9.com/, Amazon’s search engine) and AOL (http://search.aol.com/ or http://search.aol.co.uk/); Yahoo supplies AlltheWeb (http://www.alltheweb.com/) and AltaVista (http://www.altavista.com/); and Picsearch supplies Lycos (Europe) (http://www.lycos.co.uk/) and MSN Search (http://search.msn.com/).
We provide detailed reviews for Google, Yahoo and Ask in this section and for Picsearch in section 2.2. Because the other main search engines draw their results from other providers, we give brief overviews of their services rather than full reviews (although full tests were done). In most cases, the services offered by these other engines are more limited than those providing the images.
http://www.ask.com/?o=312#subject:img|pg:1
Ask is the new incarnation of Ask Jeeves - an established search engine which adopted a more personal approach to searching, encouraging its users to ask real questions (e.g. ‘how do I get to Bristol?’) rather than type in search terms. Jeeves the helpful butler has recently retired and Ask has been made over (see http://www.irconnect.com/ask/pages/news_releases.html?d=94894). Previously, Ask Jeeves relied on Picsearch for its image searching. By March 2005 it was applying its own sorting/relevancy algorithm to the Picsearch results and by the end of January 2006 it was ready to launch its very own image search engine (see http://www.irconnect.com/ask/pages/news_releases.html?d=92949). Because of this, Ask now enters the ranks of TASI’s search engine review.
| Criteria | Overall Evaluation | Comments |
|---|---|---|
| Scope | Big and broad | Ask does not advertise its image statistics. In our tests it reported about a third fewer results than Google or Yahoo and similar numbers to Picsearch. |
| Search Options | OK | Ask does not provide an advanced image search or any initial search limitation (e.g. by size or colour). However, once some results are found, it provides very useful lists of terms that may be ‘related’ or may enable you to ‘expand’ or ‘narrow’ your search. |
| Performance | Good | Quick and generally good results. We observed some occasional duplication (usually the same image on different sites). It’s important to note that although large numbers of results are reported, Ask does not enable its users to view them all. |
| Presentation | OK | The results page provides thumbnails (a fixed 16 at a time) with no other information about the images. Once clicked, the thumbnail or ‘info’ link will open up a split screen with Ask on top and the source page below. The split screen view provides a filename and URLs for both the image and the Web page on which it is found. There is a copyright warning and a link to a disclaimer. On both the results page and the split-screen view there is a ‘save’ option. No explanation is given to the user, but this link will save the image to a user’s personal space, which can later be accessed via this link: http://mystuff.ask.com/. |
| Support | OK | There is no immediately obvious ‘help’ information. The ‘About’ page has a link to ‘Help Central’, although this provides very little information about image searching. It does, however, provide a contact form where users can request specific help. |
| TASI’s test words | Good | Ask did not find the TASI or BBC image, but provided generally good results for the other searches. |
Google is the most popular search engine. From its launch in 1998, it concentrated on providing the best search facilities, long resisting the temptation to turn itself into a full-blown portal, which had proved the downfall of some of its rivals. In recent years, however, Google has been accumulating a lot of extra services (e.g. email, blogs, mapping), carefully tucked away behind its minimalist front page. This growth seems to have accelerated since its share flotation in late 2004.
| Criteria | Overall Evaluation | Comments |
|---|---|---|
| Scope | Very big and broad | Search engine statistics can be difficult to obtain, but Google’s blog reported 1.2 billion images in February 2005 - see http://googleblog.blogspot.com/2005/02/get-picture.html. In our tests Google typically reported more results than Yahoo for the terms searched (although neither lets you see them all). |
| Search Options | Very good | Includes a simple search and a good advanced search. The latter offers Boolean and phrase searching, but does not support wildcards or truncation. Results can be limited by size, file type, colour or Web domain, with adult material filtered in a ‘strict’ or ‘moderate’ way. Once results are retrieved, Google enables its users to search within them or to filter the results by size (‘large’, ‘medium’, or ‘small’). |
| Performance | Very good | Quick and generally good results. We spotted the occasional dead link. It’s important to note that although large numbers of results are reported, Google does not enable its users to view more than 1000 image results. |
| Presentation | Very good | Thumbnails first (a fixed 20 at a time) which link to a split screen with Google on top and the source page below. On the results page the user is given the ALT text (where available), filename, pixel dimensions, filesize and the location URL. Usefully, Google groups together similar pictures from the same site, presenting one to the user and offering a link to ‘more’. This can make for a better set of results. On the split screen view, the URL, pixel dimensions and filesize are given, accompanied with a notice that the image might be scaled down and subject to copyright. In addition to seeing the image in the lower frame, the user can view the original image by itself. |
| Support | Good | There is a link next to the search button to online help (in an FAQ format). However, there is no obvious way to contact anyone with a specific query. |
| TASI’s test words | Good | Found both the TASI and BBC images on a filename search (although the BBC image was from a different article on the BBC site). Provided generally good results for the general and specific search terms and a good variety of images for the more ambiguous terms - the search on ‘Queen’, for example, provided a mix of images of the rock band and royalty, with the occasional person wearing drag. |
http://images.search.yahoo.com/
Yahoo began life as a Web directory and until February 2004 was drawing most of its Web site search results from Google. During 2003 and 2004, through some strategic purchases, Yahoo established itself as a major search engine player and serious competition to Google.
| Criteria | Overall Evaluation | Comments |
|---|---|---|
| Scope | Very big and broad | Search engine statistics can be difficult to obtain, but Yahoo’s blog reported 1.6 billion images in August 2005 - see http://www.ysearchblog.com/archives/000172.html. In our tests Google typically reported more results than Yahoo for our search terms (although neither lets you see them all). |
| Search Options | Very good | Includes a simple search and a good advanced search. The advanced search provides almost identical functionality to Google: Boolean and phrase searching, but not wildcards or truncation. Results can be limited by size, colour or Web domain, with an adult material filter turned ‘on’ or ‘off’. Once a search has been done, Yahoo enables its users to filter the results by size (‘wallpaper’, ‘large’, ‘medium’, or ‘small’) or colour (‘color’ or ‘black & white’). It also has a very useful ‘Also try’ feature, suggesting phrases that might be used to narrow the search. This is particularly useful where a search term is ambiguous. |
| Performance | Very good | Quick results, with generally good quality images. We spotted the occasional image that was no longer available, and a duplicate (same image, but on different sites with different filenames). It’s important to note that although large numbers of results are reported, Yahoo does not enable its users to view more than 1000 of them. |
| Presentation | Very good | Thumbnails first: a fixed 20 at a time which link to a split screen with Yahoo on top and the source page below. On the results page the user is given filename, pixel dimensions, filesize and location URL. The same information is given on the split screen view along with a copyright warning and the options of viewing the original image by itself or emailing to a friend. |
| Support | Good | Offers brief, but useful, online help. If this isn’t sufficient, the user is able to send a more specific query to Yahoo via a form. |
| TASI’s test words | Good | Found the TASI image, but not the BBC image. Provided good results for all of the other search terms. The results for the ambiguous terms were sometimes less varied than Google’s on the first few results pages, but here Yahoo’s ‘Also try’ prompt proved helpful. |
A9 (http://a9.com/) draws on Google for its image searching but reports significantly fewer results and offers no advanced search options. The results are displayed as thumbnails without any accompanying information, meaning the user has no clues about the size or context of the image until they click on it. Once clicked, however, a split-screen very similar to Google’s appears, with the thumbnail, URL, pixel dimensions, and a warning about copyright. Like Google, A9 will not display more than 1000 images.
AOL (http://search.aol.com/aolcom/imagehome and http://search.aol.co.uk/image_idx?) draws on Google for its image searching. The UK (.co.uk) version appears to be larger and more up-to-date than the main (.com) version, although neither reports as many results or offers as much functionality as the Google image search. Like Google and A9, both versions of the AOL image search cap their results at fewer than 1000. The main AOL image search does not offer any advanced options and provides minimal information about the images. When you click on a thumbnail, a page opens displaying the image at its full size and a frame containing the original source. No copyright warning is provided. The UK version offers advanced search options and more information about the images. When you click on a thumbnail you are shown the original image at full size, but are given no copyright warning and are not shown the original source of the image. The latter must be accessed separately via a tiny link at the bottom of the page.
AlltheWeb (http://www.alltheweb.com/?cat=img) draws on Yahoo for its image searching. Although it reports a slightly smaller number of total results than Yahoo, AlltheWeb’s retrievable results seem to be identical to Yahoo’s (although they are capped at 1008 rather than 1000). Where they differ is in their functionality and display. AlltheWeb does not offer Yahoo’s useful ‘Also try’ options, provides less information about the images, and has a more limited advanced search. When you click on a thumbnail you open the original image by itself. When you click on the link below the thumbnail you go directly to the original web page. There is no split or framed page with additional metadata (Yahoo’s approach) and there is no warning about copyright - the only mention of this is made within AlltheWeb’s help FAQ.
AltaVista (http://www.altavista.com/image/) draws on Yahoo for its image searching. The total number of results reported by AltaVista sometimes varies from Yahoo, but the images retrieved are nearly identical (we noticed a couple of variations when running the test). There is no separate advanced search, but there are some filters available under the search prompt, which enable the user to limit results to certain sorts of images (photos, graphics or buttons), colour or black and white images, images from certain sources (e.g. Web, news), or images of a certain size. The results page provides thumbnail, filename, pixel dimensions and filesize, with a link to ‘more info’. This link adds colour information, the source URL and, usefully, an ‘abstract’ (descriptive text excerpted from the source Web page). It also lists other pages on which the same image might have been used. If, at the results page, the user clicks on the thumbnail rather than the ‘more info’ link, they are taken directly to the source page of the image. No copyright warnings are given with the results or on the help page.
Lycos (Europe) (http://www.lycos.co.uk/search/picture.html) draws on Picsearch (see below) for its image searching. Like Picsearch, it seems to enable users to view all the results (we gave up at 25,000); however, its selection does not seem to be as current. Nor does it provide advanced search options, although the results can be filtered by size or colour, or set to include ‘erotic’ content (not really relevant, because Picsearch automatically excludes adult content). With its main search, Lycos provides very little information about the images (pixel dimensions, filesize and format, and Web domain), with the image linking directly to the original source. On an alternative ‘speed’ or ‘slim’ search interface (http://www.lycos.co.uk/search/slim/) more of the Picsearch information is available. Both of these interfaces carry advertising and neither provides a copyright warning or help with image searching.
Microsoft’s MSN Search (http://search.msn.com/images/) draws on Picsearch for its image searching. Like Picsearch, it seems to enable users to view all the results (we gave up at 25,000); however, its selection does not seem to be as current. Like Lycos, it does not provide advanced search options, although it does enable results to be filtered by size or colour. Clicking on thumbnails produces a split screen display like Picsearch and other image search engines, although this includes less information than is available from Picsearch’s own display. Although MSN is currently relying on Picsearch, it may be developing its own image search engine (as Ask has). This is suggested by a patent Microsoft filed in late 2005 which includes an image search algorithm (see patent application).
In the 2003 review we looked at three specialised image search engines: Cobion Visoo, Ditto and Picsearch. By the time of the 2004 review Cobion Visoo had gone and Cydral had appeared, although only offering very limited results and with much of its Web site in French. Picsearch scored the best of these three in TASI’s 2004 evaluation. This time (in 2006) we have revisited Cydral, Ditto and Picsearch. As mentioned above, Picsearch is now very active in providing image services to other search engines. It seems that Ditto is now also taking most or all of its image results from Picsearch - the images returned in our tests were similar to other results provided by Picsearch and a small notice has appeared in Ditto stating ‘portions powered by Picsearch’. For this reason we present full reviews of Cydral and Picsearch, but only a brief overview of Ditto (although the full tests were run). We also note that both Cydral and Ditto now offer general Web searching in addition to their image search. Although this might seem to put them in the category above (general search engines), we have left them here because image searching is still their primary focus.
Cydral is a small image search engine. Although most of the interface is in English, some of the supporting pages are in French.
| Criteria | Overall Evaluation | Comments |
|---|---|---|
| Scope | Unknown but small | Cydral provides no statement of its size or currency, but the results suggest it is fairly small. |
| Search Options | Limited | Cydral only offers a simple search box with a ‘family filter’ on/off switch. |
| Performance | OK | Results were returned quickly, but compared with the other engines tested they were few in number and generally of lower quality and relevance. Cydral will let you see up to 1280 images (80 screens of 16 images). |
| Presentation | Good, but not fully functioning when tested | Cydral presents 16 thumbnails first, with pixel dimensions and filesize. There are three links below the thumbnail: one entitled ‘more like this’ which didn’t return any results when we tested it; another entitled ‘source’, which links directly to the original Web page; and a third entitled ‘Info’ which produces a split screen presentation similar to Google or Yahoo’s, with Cydral information at the top of the window and the original Web site below. Cydral provides: image name, file type, resolution, filesize, and links directly to the image or the Web page on which it is found. There is a clear statement about copyright and no sign of advertising. |
| Support | Unhelpful | Help and contact information is only available in French. |
| TASI’s test words | Limited | None of the specific images were found and we only retrieved small sets of results for the others. Raphus cucullatus and Les Demoiselles d’Avignon yielded no results and the results from ‘Homer’ were almost entirely related to the Simpsons character. |
Picsearch is a Swedish company founded in 2000. As mentioned above, in addition to its own dedicated site, Picsearch now licenses its image search results to several other search engines. Picsearch is unique among the image search providers in: (a) limiting itself to images; (b) automatically excluding adult images from its database; and (c) enabling its users to view very large numbers of results (we gave up when we got to 25,000 images).
| Criteria | Overall Evaluation | Comments |
|---|---|---|
| Scope | Big and broad | We have found no statement of size or currency, but the results from our tests suggest that Picsearch is at least half the size of Yahoo and Google and probably more. They also suggest that it is a similar size to Ask and very much bigger than Cydral. Note, however, that unlike other image search providers, Picsearch seems to let you see all your results. |
| Search Options | Good | Offers both simple and good advanced searching. Does a Boolean ‘And’ search by default and supports truncation and exclusion of terms. Images can be limited by colour, animation and pixel dimensions. Adult material is automatically filtered out by Picsearch. |
| Performance | Very good | Results are fast and generally very good and relevant. We observed the occasional duplicate (different filenames or locations) and dead link. Although the declared results may be smaller than Google or Yahoo, Picsearch enables you to view many more of them, perhaps all. This year we gave up when we got to 25,000 (last year we persevered to 100,000 images on one search)! |
| Presentation | Very good | Picsearch presents 20 thumbnails first and then links to a split window with Picsearch on top and the source below. Information includes pixel dimensions, filesize, file type and colour info - including number of colours in the image. There is a clear statement about the need to obtain permission before using images and no sign of any advertising. Picsearch also includes a ‘Remove image’ link, which enables a user to flag an image as ‘missing’ or ‘offensive’ or a copyright owner to request that their image be removed. |
| Support | Good | Good online help available and users can contact Picsearch via email. |
| TASI’s test words | Good | Picsearch did not find the specific image files in TASI’s test, but produced very good results on the other searches, including a good mix of images for the ambiguous terms ‘Homer’ and ‘Queen’. |
Ditto (http://www.ditto.com/) now seems to draw on Picsearch for its image searching, but provides no more than 304 results (19 pages of 16 images) and offers no advanced search options. Results are displayed as thumbnails which link directly to the original source pages. The only other information provided about the image is its filesize. There is a notice at the bottom of each page advising users that they must obtain appropriate permission to use the images. Ditto’s copyright warning and its avoidance of the split-screen presentation favoured by other image search providers may be a consequence of a 2002 copyright suit brought against it for the way it framed images (see http://en.wikipedia.org/wiki/Kelly_v._Arriba_Soft_Corporation - note that Ditto was then called Arriba).
Although a meta-search engine, which submits search requests to more than one other search engine, might seem to promise better results, this is seldom the case - especially when searching for images.
Meta-search engines typically offer fewer search options than their source search engines and the results tend to be slower, smaller in number, and less reliable.
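The merging step that a meta-search engine performs can be sketched roughly as follows. This is a simplified illustration, not the method of any service reviewed here: the function name, the round-robin interleaving and the default cap of 60 results are assumptions chosen for the example.

```python
from itertools import zip_longest


def merge_results(result_lists, limit=60):
    """Interleave ranked lists of image URLs, dropping duplicates.

    `result_lists` holds one ranked list per source engine; `limit` is an
    assumed positive cap on the number of merged results returned.
    """
    merged, seen = [], set()
    # Take the first result from every engine, then the second, and so on.
    for tier in zip_longest(*result_lists):
        for url in tier:
            if url is None or url in seen:
                continue  # a shorter list has run out, or the URL is a duplicate
            seen.add(url)
            merged.append(url)
            if len(merged) >= limit:
                return merged
    return merged
```

For example, merging `["x.jpg", "y.jpg", "z.jpg"]` with `["y.jpg", "w.jpg"]` gives `["x.jpg", "y.jpg", "w.jpg", "z.jpg"]`. Because every source must respond before the lists can be merged, and a cap discards whatever lies beyond it, a sketch like this also suggests why meta-search results can be slower and fewer than those of the source engines.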
TASI evaluated ten meta-image search engines: eight of them are general meta-search engines offering an image search option; two are really search ‘spring-boards’ rather than true meta-searches - they only allow the user to submit a search request to one other search engine at a time.
None of the general meta-search engines performed very well in our tests.
Excite (http://www.excite.com/), Dogpile (http://www.dogpile.com/), Webcrawler (http://www.webcrawler.com/) and Metacrawler (http://www.metacrawler.com/) are all provided with image meta-search results by Infospace (http://www.infospaceinc.com/). These four meta-search engines deliver the same images, all drawn from Yahoo and Ditto and seemingly capped at about 60 results. Although there are slight variations in their presentation, each offers essentially the same services and options, including an advanced image search and a useful means of dealing with ambiguities (‘Are you looking for…’) that is similar to Yahoo’s approach (‘Also try…’). Although each of these search engines provides an adult filter, this did not consistently block adult content in our tests.
Mamma (http://www.mamma.com/) used to draw its images from Picsearch, Yahoo and Google, presenting a small selection from each. Now it seems to be relying entirely on Picsearch - but delivering all the results rather than the small number it previously offered. The presentation of results is similar to Picsearch’s although the advanced search is simpler.
Fazzle (http://www.fazzle.com/) used to be called SearchOnline.info. It reports results from Picsearch, Webshots, AltaVista and FAST. AltaVista is now supplied by Yahoo and FAST no longer exists, so it might be that the results reported from these sources are old. We observed a number of dead links. Although Fazzle reports a large number of hits, it only presents the ‘top’ hundred or so images. It also seems to pull in its results unfiltered, even if the adult filter in the advanced search is turned on.
Ithaki (http://www.ithaki.net/) provides images from Yahoo, Picsearch, Freenet (a German site) and Arianna (Italian). There is only a simple search box and only a handful of images are delivered from each engine: we got no more than 50 results for our test searches.
Ixquick (http://www.ixquick.com/) does not say where its image results are drawn from (there seemed to be some images in common with Yahoo and Picsearch, but this may have been coincidental). Although it reports large numbers of results, it is not possible to view them all: one of the searches enabled us to view 2400 images, but others would only display a few hundred of the hundreds of thousands of results Ixquick reported. Ixquick offers minimal information about the images and no advanced search, image filtering, or help.
These services are often classed as meta-search engines, but they are more like search ‘spring-boards’, since they only pass on the user’s search to one other search engine at a time (SearchEngineWatch term these ‘All-in-one’ search engines). Since they only offer very simple search options and are doing little more than passing a message on, they can seem like a waste of time and often are.
Photoseek (http://www.photoseek.net/) gives its users the option of searching Photos.com, ClipArt, Ditto, Excite, Lycos or AltaVista. It simply passes on the request to the selected search engine and then frames the results.
Search 22 Picture and Image Search Engines (http://www.search-22.com/downloads/images.php) offers a much better spring-board to image searching than Photoseek or any of the proper meta-search engines listed above. Its twenty-two image search engines are a mixed bag, but the key ones are included. Although Search 22 does not offer the sophistication available when these other engines are searched directly, it does enable the user to make a very quick check across a number of different services - perhaps useful in ascertaining which engines are worth searching in a more in-depth fashion.
While image search engines strive for comprehensiveness and try as best they can to automatically identify and index images, collection-based engines draw on a smaller pool of catalogued images, usually from a database and usually indexed by humans. Large digital libraries and commercial photo or clip art providers are good examples of these collection-based search engines, as are community-based photo collections. However, it may be that the largest group of image collections on the Web is made up of adult-oriented services.
Digital library collections tend to be limited to particular subjects or formats (e.g. historic regional photographs or art collections), while the many commercial stock photo collections aim for as broad a coverage as you would find in the independent image search engines evaluated above.
Because all of these collections are held in databases rather than embedded in Web pages, they tend to shut out the ‘spiders’ or ‘robots’ that independent search engines use to trawl and index the Internet. As a result, their images will seldom be found among general search engine results and are sometimes said to belong to the ‘hidden Web’.
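One common way in which a site shuts out spiders is the robots exclusion convention (a robots.txt file). The review does not say which mechanism any particular collection uses, so the following is a minimal sketch under that assumption. The robots.txt content, the crawler name and the URLs are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an image collection site; not taken from
# any real service discussed in this review.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search/
Disallow: /images/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt rules from a string, without fetching anything."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

if __name__ == "__main__":
    # A well-behaved spider checks each URL against the rules before fetching.
    for url in (
        "http://collection.example/images/12345.jpg",
        "http://collection.example/about.html",
    ):
        print(url, parser.can_fetch("ExampleImageBot", url))
```

With these rules a compliant spider may index the site’s ordinary pages but not its image records. Database-driven pages that are only generated in response to a search form are also out of a spider’s reach, whatever the robots.txt file says.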
While most commercial image collections are very happy to shut out search spiders, many owners of digital library collections would like to make their contents more easily discoverable. It seems likely that advances in metadata and Web-coding will improve this situation. Currently, any comprehensive search for images would need to include image search engines and the relevant collection-based engines. (For more on finding images, see TASI Advice Document: Finding Images Online.)
Apart from their scope, the other chief difference between collection-based search engines and general image search engines is that the collection-based engines usually search on metadata entered by humans rather than automatically generated indexes that guess at their images’ contents. As a consequence, the results from a collection-based search are likely to be much more consistent and relevant than those delivered by automated engines.
TASI’s own database of Image Sites includes information on many image collections, particularly digital library collections. Since stock photo collections are aiming for the same kind of scope and comprehensiveness as general image search engines, we evaluate the two largest below (Corbis and Getty) to offer a comparison. We also consider Webshots, a very large collection built from a mixture of stock collections and personal online photo albums, and Flickr, another collection of personal photo albums. All of these collections were searched as public users rather than registered users.
Founded by Microsoft’s Bill Gates in 1989, Corbis owns or licenses a vast collection of photography and fine art images. In 2003 Corbis was trying hard to reach the non-commercial user by selling its images through Yahoo and offering two separate interfaces to its own Web site: personal and professional. Now, in 2006, the Corbis site seems less concerned with personal use. The personal interface has gone, although there are subscriber-based educational collections available for North American users. We tested the main, professional interface, which searches across creative and editorial collections.
Corbis
http://pro.corbis.com/
| Criteria | Overall Evaluation | Comments |
|---|---|---|
| Scope | Very broad | Extremely large and up-to-date collection. Corbis claim more than 70 million images, though it’s not clear how many are available to search. |
| Search Options | Best | Simple and advanced searches. The advanced search is the best TASI saw in this review, enabling quite sophisticated filtering (before and after the search) and full Boolean, phrase and truncated searching. Searches can be limited to specific categories or collections, by date added, point of view, orientation, resolution, type of image, and other criteria. |
| Performance | Very good | Quick, numerous, relevant and high-quality results, as expected from a large stock photo database. Default is to receive 1000 results, but this can be increased to 10000 - a number exceeded by some of our searches. All the results returned can be viewed. |
| Presentation | Very good | Thumbnails first (up to 100 a page). Various file sizes are offered and the images are captioned and credited along with a date and keywords. There are clear statements of copyright and the larger images contain visible watermarks. |
| Support | Very good | Very full online help and the option of contacting Corbis by form or email. |
| TASI’s test words | Mixed | Unsurprisingly, none of the specific files among TASI’s test terms were found. There were no results for the Picasso painting and some mixed results for ‘Raphus cucullatus’ (Dodo) and ‘Homer’. For ‘Queen’, Corbis presents a ‘Term Clarification’ box, inviting you to narrow the search to chess piece, insect, playing card, rock band or royalty. No such box was offered with ‘Homer’. For generic categories like eagle and leaf, the images provided were superb. |
http://creative.gettyimages.com/
Getty Images was founded in 1995 by Mark Getty and Jonathan Klein and has grown to rival Corbis in the commercial image market. Over the past few years it has bought or formed partnerships with many large image collections, including the Bridgeman Art Library, Hulton Archive, ImageBank, National Geographic and the TimeLife collection. Many of Getty Images’ collections retain their own identities and often their own Web sites, but it is also possible to search across the collections via Getty Images’ Creative or Editorial search engines. We evaluated Getty’s Creative search engine, which is clearly pitched at the professional user.
| Criteria | Overall Evaluation | Comments |
|---|---|---|
| Scope | Very broad | Extremely large and up-to-date collection serving millions of images (although exact statistics are unavailable). |
| Search Options | Very good | Simple and advanced options, including full Boolean, wildcard and phrase searching and a number of filtering options (e.g. orientation, photo type) available before or after a search. |
| Performance | Good | Generally very quick to search, although the number of hits was not always as high as expected. |
| Presentation | Good | 60 thumbnails per page. The detailed records include title, photographer and keywords, although the extent of the information varies from image to image depending on which collection it is drawn from. There is always a clear identification of copyright and rights status. |
| Support | Good | Good help is available online and more is available via a form or phone number. |
| TASI’s test words | Good | Unsurprisingly, none of the specific images were found. There were some good results on the general terms (eagle and leaf), but poor or mixed results on others. We did not find ‘Raphus cucullatus’ (Dodo) or Les Demoiselles d’Avignon and there were just three results for Chaucer. ‘Icons’, ‘Queen’, ‘Homer’ and ‘West Bank’ brought up search clarification boxes, although these did not include options for religious icons, the band Queen, Homer the poet, or Homer the TV character. We note, however, that our search was limited to Getty’s Creative search engine. We may have found more results for some of these via the Editorial search engine. |
There are many stock photo collections available online. TASI includes others in its advice paper on Finding Stock Images. Others can be found via the membership lists of the British Association of Picture Libraries and Agencies (BAPLA - http://www.bapla.org) and the Picture Archive Council of America (PACA - http://www.stockindustry.org/). There is also a very useful online directory called Stock Index Online (http://www.stockindexonline.com/).
Webshots is a community photo album with a commercial spin. Members store their own photographs online and these become available for others to search, view, and, in some cases, purchase.
| Criteria | Overall Evaluation | Comments |
|---|---|---|
| Scope | Vast but eclectic | Claims to have more than 360 million photos. Collection includes individual contributions supplemented with stock photo collections, making for uneven and sometimes eclectic subject coverage. |
| Search Options | Average | Offers simple and advanced search interfaces, although the advanced is quite limited. Webshots excludes adult images from its collection. |
| Performance | Average | Some good results on some of the general terms, but poorer coverage of some of the more specific subjects we searched on. Webshots is searching on user captions, so this sometimes throws up odd results. |
| Presentation | Average | 20 thumbnails per page along with minimal additional information. Some Webshots images are available for purchase as screensaver images or as posters, others (the community contributions) can be viewed full-size, bookmarked as favourites, blogged, or sent to others as e-cards. Webshots includes a lot of advertising alongside its results. Its terms and conditions remind contributors and users about copyright restrictions, as does a copyright symbol under thumbnails on the results pages. |
| Support | Poor | Online help concentrates on the commercial and photosharing functions rather than the search. There is no easy way to ask questions about searching. |
| TASI’s test words | Average | Unsurprisingly, none of the specific images were found although there were some images of Les Demoiselles d’Avignon - with people standing beside it in the gallery! There were mixed results on the other search terms: some good results for the general terms (‘leaf’ and ‘eagle’) but poorer results for the others. |
Flickr is another community photo album. It was launched in 2004 and in early 2005 was purchased by Yahoo. Flickr has become something of a phenomenon, growing rapidly and receiving much attention because of its social approach to tagging and sharing images. Its tagging is often cited as a good example of a ‘folksonomy’ - cataloguing by users. It encourages users to share images via the Creative Commons licensing scheme (http://creativecommons.org/), and it tries to build community through commenting, public favourites and blogging.
| Criteria | Overall Evaluation | Comments |
|---|---|---|
| Scope | Large but eclectic | Flickr was reported as containing 5.5 million images when acquired by Yahoo in 2005. This will have grown significantly since then. Because the collection is based on individual contributions, subject coverage is uneven and eclectic. |
| Search Options | Poor (but…) | Flickr only offers a simple search box; however, it has many features to encourage browsing and linking (e.g. groups, personal albums, favourites and popular tags). |
| Performance | Average | Results are returned quickly and are of mixed quality. |
| Presentation | Average | 20 thumbnails per page with a link to the image (via the thumbnail) or the collection of the image contributor (via their username). The full page for the image includes a caption and provides a context for the image: any of the user’s sets it belongs to, any public groups it’s included within, any tags or comments that have been applied, and anyone’s favourites collection that has it bookmarked. Flickr is non-commercial and has clear statements about copyright for those contributing or using images. It encourages users to use Creative Commons licenses. |
| Support | Average | Like Webshots, Flickr’s online help concentrates on photosharing functions rather than searching. In keeping with the social nature of Flickr, there are online forums where users can raise questions and share experiences. |
| TASI’s test words | Average | Unsurprisingly, none of the specific images were found. The generic terms drew large sets of results, although the images varied considerably in quality, and the user tagging system means that many of the images may not seem relevant to a particular search (e.g. a search on ‘Chaucer’ brings up a dog of that name). Flickr offers a ‘most interesting’ filter which does a good job of pulling out high quality images. |
All the image search engines considered so far in this review have been based on text and context strategies or on associated catalogue entries. There have also been a number of attempts to build content-based search engines. Content-Based Image Retrieval (CBIR) considers the characteristics of the image itself, for example its shapes and colours. To date, these attempts have been experimental and generally limited to individual collections.
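As a rough illustration of what ‘considering the characteristics of the image itself’ can mean, the sketch below compares images by their colour distribution alone, using coarse colour histograms and histogram intersection. This is one textbook CBIR technique chosen for illustration. It is not a description of how WebSEEk, QBIC or any other system works. The tiny ‘images’ are hypothetical lists of RGB pixels, not real files.

```python
from collections import Counter


def colour_histogram(pixels, bins_per_channel=4):
    """Return a normalised histogram of coarse RGB colour bins.

    `pixels` is a sequence of (r, g, b) tuples with values 0-255.
    """
    if not pixels:
        raise ValueError("an image needs at least one pixel")
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = sum(counts.values())
    return {colour_bin: n / total for colour_bin, n in counts.items()}


def histogram_intersection(hist_a, hist_b):
    """Similarity between 0.0 (no shared colours) and 1.0 (identical mix)."""
    return sum(min(share, hist_b.get(colour_bin, 0.0))
               for colour_bin, share in hist_a.items())


# Hypothetical toy images: mostly red, mostly red again, and all blue.
mostly_red = [(250, 10, 10)] * 90 + [(255, 255, 255)] * 10
also_red = [(240, 20, 20)] * 80 + [(255, 255, 255)] * 20
all_blue = [(10, 10, 250)] * 100

if __name__ == "__main__":
    query = colour_histogram(mostly_red)
    for name, image in (("also_red", also_red), ("all_blue", all_blue)):
        score = histogram_intersection(query, colour_histogram(image))
        print(name, round(score, 2))
```

A measure like this knows nothing about what an image depicts: two pictures with a similar balance of colours score highly whether or not they show the same subject, which is one reason such approaches remain experimental.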
Well-publicised efforts have included Columbia University’s WebSEEk project (http://persia.ee.columbia.edu:8008/) and IBM’s QBIC (Query by Image Content - http://wwwqbic.almaden.ibm.com/), which can be seen in action on the Hermitage Museum website (http://www.hermitagemuseum.org/). UK examples include the AHDS Visual Arts (http://ahds.ac.uk/visualarts/) and IMAGINE (http://www.imagine.org.uk/) collections. Wikipedia’s entry on CBIR maintains a good, current set of links (http://en.wikipedia.org/wiki/CBIR).
Some adult-site blockers use simple content analysis to filter out adult images, but these are not always effective, since they are based on screening out images with particular (flesh-coloured) tonal values.
Image search engines attempt to give access to the wide range of images available on the Internet.
For those used to viewing well-indexed collections of quality images, the results of the large automated image search engines will probably disappoint. The poor quality of their offerings is not surprising, since they reflect the randomness and unevenness of the Web. The frequent irrelevancy of their results is also explicable, since the automated engines are guessing at their images’ visual subject content using indirect textual clues.
Anything, then, that enables the user to have more control over their image searching is helpful. The ability to filter a search - to include and exclude items - is important in any Web searching, but particularly so when searching for images. Many users will wish to exclude adult imagery from their search results, but it can also be very useful to limit by file type, file size, or colour - and the ability to use Boolean logic or phrases will greatly improve the relevancy of the results. An image search engine also needs to return a reasonable number of results, since in any given search a fair proportion of the images found are likely to be irrelevant or of insufficient quality.
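The kind of control described above can be sketched as a simple filter over image records. This is a hypothetical illustration only: the record fields, captions and parameter names are invented for the example and do not reflect the query syntax of any engine reviewed here.

```python
from dataclasses import dataclass


@dataclass
class ImageRecord:
    caption: str
    file_type: str  # e.g. "jpg", "gif"
    width: int      # pixels


def matches(record, all_of=(), none_of=(), phrase=None,
            file_types=None, min_width=0):
    """Return True if the record passes every filter supplied."""
    text = record.caption.lower()
    words = set(text.split())
    if any(term.lower() not in words for term in all_of):
        return False  # Boolean AND: every required term must appear
    if any(term.lower() in words for term in none_of):
        return False  # Boolean NOT: no excluded term may appear
    if phrase is not None and phrase.lower() not in text:
        return False  # phrase search: the words must appear together
    if file_types is not None and record.file_type.lower() not in file_types:
        return False  # limit by file type
    return record.width >= min_width  # limit by image size


# Hypothetical records, echoing the ambiguity of a search on 'Queen'.
RECORDS = [
    ImageRecord("Queen Elizabeth II at Windsor", "jpg", 1200),
    ImageRecord("Queen bee on honeycomb", "jpg", 800),
    ImageRecord("Queen rock band live on stage", "gif", 320),
]

if __name__ == "__main__":
    royal = [r for r in RECORDS
             if matches(r, all_of=["queen"], none_of=["bee", "band"],
                        file_types={"jpg"}, min_width=640)]
    for record in royal:
        print(record.caption)
```

Each added filter removes a class of irrelevant results before the user has to look at them, which is why engines that offer only a single search box tend to need many more results to yield a usable image.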
Here the meta- image search engines fail their users. They deny them sophisticated filtered searching and generally only bring back a handful of the results they find. A meta- or spring-board service like Search 22 can be useful in identifying the best image search engine for a particular task, but the final searching is better done directly, using the source search engines themselves. Of these, we found that Google, Yahoo and Picsearch performed the best in our test. It will be interesting to see how Ask’s image search develops and whether MSN comes up with its own image search.
Collection-based image search engines include images selected for quality and indexed by hand. The images they contain are seldom found within the results of general Web search engines. Collection-based engines, then, will usually offer much better results than their search engine counterparts. The commercial and copyright issues will also be much clearer - although many users seem to prefer to look to image search engines for images they can ‘freely’ re-use, as if easy access and absent copyright notices let them off the moral and legal hook. Truly copyright-free images are very rare on the Web.
The Internet is a fast-changing environment, so it is likely that some of the information in this review will quickly date. The reader is well advised to check the sites themselves and to ask the kinds of questions TASI has included in its evaluation criteria.
The most important question, however, is not on our evaluation criteria and can only be answered by each individual user: ‘For what purpose do I want to find images?’ The answer to this question needs to inform the choice of search tools, and the choice of search terms you use.
For more advice on searching for images see TASI’s Advice Document: Finding Images Online.
Other sources of information include:
Search Engine Watch
http://www.searchenginewatch.com/
A site devoted to monitoring and evaluating search engine developments. Some information is subscription-only, but much is free.
Search Engine Showdown
http://notess.com/search/
One librarian’s very useful site on searching and search engines.
Last reviewed May 2006