2004 Review of Image Search Engines

October 2004

Please note: This page is an earlier version of our current Review of Image Search Engines, which we have archived for reference purposes only. Links on this page are no longer being checked and may no longer work. Please refer to our current Review of Image Search Engines for the latest information.

  1. Introduction
  2. General, specialised and meta- search engines
  3. Collection-based search engines
  4. Content-based search engines
  5. Conclusion

1. Introduction

TASI conducted its first review of image search engines in early 2003 (available here). Now, eighteen months later, we repeat the exercise. Eighteen months is a long time in the world of search engines and during this period there have been some important changes among the bigger general search engines (chiefly, the replacement of AllTheWeb’s image search with Yahoo!’s). These changes are discussed below. The information presented here is correct for October 2004: if you are reading this some time after that date, you are advised to check the sites carefully and to ask the kinds of questions TASI has included in its evaluation criteria (below).

The phrase ‘Image Search Engine’ is most often used of Web-based services that collect and index images from other sites on the Internet. Image searching is sometimes offered by general search engines, like Google or Yahoo!, but there are also specialised image search engines - services devoted to indexing images or multimedia. In addition, there are meta- search engines, which pass on search requests to more than one search engine and then bring back the results.

Sometimes ‘Image Search Engine’ is also used to refer to collection-based search engines - services that index a single or small number of image collections. Large digital libraries or commercial stock photo collections, like Corbis, typically offer their own search engine-like facilities.

All of these types of image search engines are text-based - their indexes are created from words associated with the images. In addition, there have been attempts to create content-based search engines, which ‘index’ visual characteristics of an image, such as its shape and colour. However, these attempts are still largely experimental and are often limited to single image collections.

This critical review concentrates on general, specialised, and meta- search engines (section 2), but also covers, briefly, collection-based and content-based engines (section 3 and section 4).

2. General, Specialised and Meta- Search Engines

Image search engines are based on existing search engine technology, but they use additional strategies to identify, categorise and rank images.

A search engine’s indexing of images is done automatically, rather than using human indexers, so it must find ways to guess at the image’s content. It might take into account its filename or any accompanying ‘ALT’ text (an attribute coded into the image’s tag in the HTML page). It might look for clues from the image’s context - for example, the words or phrases that are close to the image, or the ‘META’ tags found at the top of the HTML coding. The nature of the Web site and its provider may also be taken into account.

Analysis of an image’s text and context can be used to exclude images as well as include them - for example, an image engine will usually consider an image’s context and associated words when it is blocking out adult material.
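
The kinds of textual clues described above can be illustrated with a short sketch. It is not how any of the engines reviewed here actually work: the class name, the choice of the ten preceding words as ‘nearby’ text, and the one-word blocklist are all assumptions made for the example. It uses only Python’s standard library.

```python
from html.parser import HTMLParser
import os
import re
from urllib.parse import urlparse


class ImageClueParser(HTMLParser):
    """Collect the text clues an indexer might associate with each image."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta_keywords = ""
        self.images = []          # one dict of clues per <img> tag
        self._in_title = False
        self._recent_text = []    # words seen since the previous image

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
            self.meta_keywords = attrs.get("content") or ""
        elif tag == "img":
            src = attrs.get("src") or ""
            filename = os.path.basename(urlparse(src).path)
            stem = os.path.splitext(filename)[0].lower()
            self.images.append({
                "src": src,
                # Split the filename on punctuation: monarch_butterfly-01 -> 3 words
                "filename_words": [w for w in re.split(r"[^a-z0-9]+", stem) if w],
                "alt": attrs.get("alt") or "",
                # Assumption: only the ten words before the image count as context
                "nearby_text": " ".join(self._recent_text[-10:]),
            })
            self._recent_text = []

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data.strip()
        else:
            self._recent_text.extend(data.split())


BLOCKED_WORDS = {"casino"}  # hypothetical blocklist, for illustration only


def is_blocked(clues, blocked=BLOCKED_WORDS):
    """Exclude an image if any of its textual clues contains a blocked word."""
    words = set(clues["filename_words"])
    text = (clues["alt"] + " " + clues["nearby_text"]).lower()
    words.update(re.findall(r"[a-z0-9]+", text))
    return not words.isdisjoint(blocked)
```

Feeding a page to `ImageClueParser().feed(html)` fills `images` with one record per picture. The same records serve both purposes mentioned above: they can be indexed for inclusion, or passed to `is_blocked` for exclusion. Because every clue is indirect, an image with an uninformative filename and no ALT text gives the indexer very little to work with.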

In reviewing image search engines, it is helpful to use a set of evaluation criteria and a standard set of search terms. Although the exercise is still subjective, these will offer some basis for comparison.

 

Evaluation Criteria
In reviewing image search engines, TASI asked the following questions:
Scope
  • Size and scope - how big and comprehensive is the index? Is this clearly stated?
  • Currency - how often is the index updated? Is this clearly stated?
Search Options
  • Simple/Advanced - are simple and advanced search options available?
  • Sophistication - is it possible to conduct Boolean, phrase and wildcard or truncated searches?
  • Filtering - can searches be limited before and after the search?
  • Adult Material - can adult content be excluded?
Performance
  • Speed - are the results quickly returned?
  • Number of hits - does the number of results seem reasonable given the size and scope of the index? Is it actually possible to view all of the results (in many cases it’s not)?
  • Relevancy - how well do the results match the query?
  • Quality - how good are the images supplied?
  • Currency - are there any missing images or dead links?
  • Duplication - is there any obvious duplication of images?
Presentation
  • Thumbnails - are the results presented as thumbnails or just text links?
  • File information - do the results include any of the following useful information: file format, file size, image dimensions and colour information?
  • Other Metadata - do the results include any of the following: caption, filename, alt tag, page title or other associated text?
  • Copyright - is it clear who owns the copyright and how the images can be used? Is it possible to link to the original source of the image?
  • Adverts - is the page cluttered with adverts? Are the results skewed in favour of commercial images?
Support
  • Online Help - is there adequate help information available online?
  • Offline Help - is it possible to contact someone about the service?

 

Search Terms
TASI tested the image search engines using the following terms:
Category | Terms and phrases | Rationale
General search terms | Elephant; Flower | General, relatively unambiguous, terms
Specific search terms | Danaus plexippus [monarch butterfly]; Cleopatra | Scientific name to test the indexing and historic figure to test scope and relevancy
Ambiguous search terms | West Bank; Alexander Pope; Madonna | To test the weighting of the indexes and their handling of phrases
Specific images | cmy_01.gif (an illustration from the TASI Web site: http://www.tasi.ac.uk/advice/creating/image.html); newaeflogo.jpg (logo of the American Eagle Foundation: http://www.eagles.org/); sforza2.jpg (illustration from the British Library Web site: http://www.bl.uk/collections/treasures/sforza.html) | Search on unique files to test the comprehensiveness of coverage. We searched on the filenames themselves and on relevant keywords: “American Eagle Foundation logo” and “Sforza Hours” (with and without the quotation marks)

 

TASI’s review was carried out during the first two weeks of October, 2004. For each search engine reviewed we searched on all of the terms and tried to answer all of the questions - the results are presented here in a summary form for convenience. We have also given overall evaluations (poor, disappointing, good etc.). These are obviously subjective judgements, but are based on the more detailed evaluations and on a comparison of all the engines reviewed.

2.1. Major Search Engines offering an Image Search Option

In TASI’s 2003 review we looked at AllTheWeb, Altavista, Google and Lycos in this category, suggesting that Google and AllTheWeb offered the best image searches among the general search engines. Now, in late 2004, the only engines to consider are Yahoo! and Google. Yahoo! has acquired its own sophisticated search engine technology (via its subsidiary Overture) and now supplies image search functionality to Lycos (http://www.lycos.com/), Altavista (http://www.altavista.com/) and AllTheWeb (http://www.alltheweb.com/). Yahoo! owns the last two. We only review Google and Yahoo! here since none of the others offer any additional image search functionality and some of them offer considerably less. AOL Search (http://search.aol.com/) draws its image results from Google, and Ask Jeeves (http://www.ask.com/) uses PicSearch (reviewed in section 2.2 below). Hotbot (http://www.hotbot.com/) and MSN Search (http://search.msn.com/) do not offer true image search features: they simply enable users to limit their search results to ‘only pages with images’.

 

Google Image Search
http://images.google.com/

Google has long been the favourite search engine of search experts and general Web users. Since its launch in 1998, it has concentrated on providing the best search facilities, long resisting the temptation to turn itself into a full-blown portal, which proved the downfall of some of its rivals. However, by late 2004 it has accumulated a lot of portal-like extras, carefully tucked away behind its tidy front page. It also remains to be seen what effect its float on the stock exchange will have on its services. Although still the biggest and most popular search engine, it faces a growing threat from Yahoo!

Criteria | Overall Evaluation | Comments
Scope | Broadest | Very big and up-to-date. Claims to search 800 million images and 4 billion Web sites.
Search Options | Very good | Includes a simple search and a good advanced search. The latter offers Boolean and phrase searching, but does not support wildcards or truncation. Results can be limited by size, file type, colour or Web domain, with adult material filtered in a ‘strict’ or ‘moderate’ way. Once some results are retrieved, Google enables its users to search within them. It also, very usefully, enables users to limit the results to images that are large (typically full-screen or bigger), medium, or small (often tiny gif images).
Performance | Good | Quick and reasonably good results. We spotted a few dead links and a little duplication (same picture, but on different sites with different filenames). It’s important to note that although large numbers of results are reported, Google will not actually allow its users to view more than eight or nine hundred images.
Presentation | Good | Thumbnails first (a fixed 20 at a time) which link to a split screen with Google on top and the source page below. Information includes: filename, extension, pixel dimensions, filesize and the location URL. This is accompanied (in the split-screen view) with a simple copyright warning. Usefully, Google groups together like pictures from the same site, presenting one to the user and offering a link to ‘more’. This can make for a better set of results.
Support | Good | Good online help available and an email contact can be found - after a little hunting about.
TASI’s test words | Good | Found all of the specific files except the Sforza image and provided reasonably good results for the other terms. Offered a mix of singer and saint for ‘Madonna’, some good maps and photos of Israel for ‘West Bank’, and did especially well at representing ‘Alexander Pope’.

 

Yahoo! Image Search
http://images.search.yahoo.com/

Yahoo! began life as a Web directory and until February 2004 was drawing most of its Web site search results from Google. During 2003 and 2004, through some strategic purchases, Yahoo! established itself as a major search engine player. Its Web site and image results are now supplied by its subsidiary Overture and it offers some serious competition to Google.

Criteria | Overall Evaluation | Comments
Scope | Broad | Size is unstated, but the numbers of results recorded were generally less than a third of Google’s. However, note that neither engine actually lets you view the entire set of results.
Search Options | Very good | Includes a simple search and a good advanced search. This has the same functionality as Google’s: i.e. Boolean and phrase searching, but not wildcards/truncation. Results can be limited by size, file type, colour or Web domain, with an adult material filter turned ‘on’ or ‘off’. Once results are retrieved there are no options to further filter the search.
Performance | Very good | Results are quick, images are good and relevant, and no dead links or duplicates were observed. It’s important to note that although large numbers of results are reported, Yahoo! does not actually allow its users to view more than about 1100 images.
Presentation | Good | Thumbnails first: a fixed 20 at a time, presented within a 115x115 pixel slide frame. These link to a split screen with Yahoo! on top and the source page below. Information includes: filename, extension, pixel dimensions, filesize and the location URL. This is accompanied (in the split-screen view) with a simple copyright warning.
Support | Very good | Offers good online help and the ability to contact Yahoo! by form.
TASI’s test words | Good | Found all of the specific files except the Sforza image and gave very good results for the other terms.

2.2. Specialised Image Search Engines

In the 2003 review we looked at three specialised image search engines: Cobion Visoo, Ditto and Picsearch. Cobion Visoo has disappeared, but Cobion still offers an image search as part of the German search engine Dino (http://www.dino-online.de/). Cydral is a newcomer, but offers very limited results and much of its Web site is in French. Of the three specialised image search engines reviewed here (Cydral, Ditto and Picsearch), Picsearch scored the best in TASI’s evaluation.

 

Cydral
http://en.cydral.com/

Cydral is a small image search engine. Although the main interface is in English, most of the supporting pages are in French.

Criteria | Overall Evaluation | Comments
Scope | Unknown but very small | No clear statement of size or currency, but seems fairly limited.
Search Options | Limited | Simple search box with a Family Filter on/off switch.
Performance | Poor | Sluggish, with very small sets of results delivered.
Presentation | Good | Similar to Google and Yahoo!: 16 thumbnails first, each with a filename that links to a split window with Cydral on top and the source below. Information includes pixel dimensions, file type and size. There is a clear statement about copyright and no sign of advertising.
Support | Unhelpful | Help and contact information is available in French.
TASI’s test words | Limited | None of the specific images were found and only small sets of results were returned for the others. Danaus plexippus yielded no butterfly pictures, but a lot of results with ‘dana’ in the filename.

 

Ditto.com
http://www.ditto.com/

Ditto is US-based and has been around for a few years, formerly under the name Arriba. In 2002 they were taken to court for copyright infringement and are now much more careful about the way they link to other people’s images (they take you directly to the source, without framing as other image search engines still tend to do). Their index is created through a mix of automatic and human editing.

Criteria | Overall Evaluation | Comments
Scope | Unknown | No clear statement of size or currency.
Search Options | Poor | Only a simple search available, with no filtering or evidence of sophisticated search support. Ditto automatically filters out adult images.
Performance | Average | Slow, with a small number of hits, although these were reasonably good and relevant. However, there were some dead links and duplication (albeit of pictures rather than files).
Presentation | Poor | Thumbnails only, 9 to a page, which link directly to the original source. Includes a lot of advertising, some of which resembles the results. The only information given about an image is its filesize - apart from some of the commercial images, which add a title. There is a copyright disclaimer at the bottom of each page and the thumbnails link to new windows - both consequences of their lawsuit.
Support | Average | Limited online help available, but users can contact Ditto via a form.
TASI’s test words | Average | Ditto failed to find any of the specific files, and gave mixed results for the others.

 

Picsearch
http://www.picsearch.com/

Picsearch is owned by a Swedish company. In addition to its own dedicated site, it licenses its image search to Ask Jeeves (http://www.ask.com/).

Criteria | Overall Evaluation | Comments
Scope | Unknown | No clear statement of size or currency - larger than Ditto, but very much smaller than Google or Yahoo!
Search Options | Good | Offers both simple and (moderately) advanced searching. Does a Boolean ‘And’ search by default and supports truncation and exclusion of terms. Images can be limited by colour, animation and pixel dimensions. Adult material is automatically excluded.
Performance | Good | Results are generally very good and relevant. No duplication or dead links were observed. The declared results may be fewer than Google’s or Yahoo!’s but you can actually view many more of them. We were able to look at up to 100,000 images on one search (for ‘Flower’)!
Presentation | Good | Similar to Google and Yahoo!: 16 thumbnails first and then links to a split window with Picsearch on top and the source below. Information includes pixel dimensions, filesize, file type and colour info - including number of colours in the image. There is a clear statement about copyright and no sign of advertising.
Support | Good | Good online help available and users can contact Picsearch via email.
TASI’s test words | Good | Found the American Eagle Foundation logo once the .jpg file extension was removed, but none of the other specific files. The other search terms produced very good results.

2.3. Meta- Image Search Engines

Although a meta-search engine, which submits search requests to more than one other search engine, might seem to promise better results, this is seldom the case - especially when searching for images.

Meta-search engines will generally offer fewer search options than their source search engines and the results tend to be slower, smaller in number, and less reliable.
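
The basic mechanism of a meta-search can be sketched in a few lines. This is an illustration only, written under stated assumptions: the ‘engines’ are placeholder functions supplied by the caller (no real search engine interface is assumed), and the round-robin merge and the cap on results per engine are choices made for the example rather than a description of any service reviewed here.

```python
from itertools import zip_longest


def meta_search(query, engines, per_engine_limit=10):
    """Send one query to several engines and merge what comes back.

    `engines` maps an engine name to a callable that takes the query and
    returns a ranked list of image URLs. The callables are hypothetical
    stand-ins for whatever each source engine provides.
    """
    ranked_lists = []
    for name, search in engines.items():
        try:
            urls = search(query)[:per_engine_limit]
        except Exception:
            # A slow or failing source is simply skipped, which is one
            # reason delivery from the source engines can be patchy.
            continue
        ranked_lists.append([(name, url) for url in urls])

    merged, seen = [], set()
    # Interleave: every engine's first hit, then every second hit, and so on.
    for tier in zip_longest(*ranked_lists):
        for item in tier:
            if item is None:
                continue
            name, url = item
            if url not in seen:  # drop images reported by more than one engine
                seen.add(url)
                merged.append({"url": url, "source": name})
    return merged
```

Even this toy version shows where the weaknesses come from: the combined search is only as fast as its slowest source, only a capped handful of results is taken from each engine, and any advanced options or filters offered by the source engines are lost unless the meta-engine passes them on explicitly.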

TASI evaluated ten meta-image search engines: eight of them are general meta-search engines offering an image search option; two are really search ‘spring-boards’ rather than true meta-searches - they only allow the user to submit a search request to one other search engine at a time.

General meta- search engines

None of the general meta-engines tested performed very well.

Both Excite (http://www.excite.com/) and Dogpile (http://www.dogpile.com/) draw their image results from FAST (i.e. Yahoo!, with adult filter switched on) and Ditto. All of the thumbnails link directly to the original source. There are no advanced search options available and the delivery from the other search engines is sometimes patchy.

Webcrawler (http://www.webcrawler.com/) and Metacrawler (http://www.metacrawler.com/) are very similar (along with Dogpile, they are owned by InfoSpace Inc). Webcrawler claims to source its images from both Yahoo! and Ditto, but the results we saw all seemed to come from Yahoo! However, Metacrawler did have some results from Ditto. Both of these meta-engines offer advanced search options, including colour, format, size and a mature content filter. They also provide two ways to display the results, by source or relevance. This feature seems to be more useful with their general Web site search: when searching for images the relevance search option serves to drastically cut the number of results.

Mamma (http://www.mamma.com/) draws on Picsearch, Lycos (i.e. Yahoo!) and Google for its image search, presenting a combined total of 34 results and no option of getting any more. There are no advanced options and the images seem to be taken from the other engines with the mature content filtering switched off!

Fazzle (http://www.fazzle.com/) used to be called SearchOnline.info. It sports some advanced searching options but its presentation of results is poor. These are drawn from Lycos, FAST and Altavista (i.e. all Yahoo!), Picsearch, and Webshots with some advertising thrown in. Although it records the total number of hits, Fazzle only presents the ‘top’ fifty or so images. Like Mamma, it seems to pull in the results unfiltered, even if you switch on the adult filter in the advanced search.

Ithaki (http://www.ithaki.net/) claims to search Cydral, Yahoo, Picsearch, Google, Dino, and Alltheweb, although we only saw results from the first three. Ithaki only offers a simple search box, although this supports some Boolean operators. Only a handful of images are delivered from each engine: we could get no more than 52 results for any search.

Ixquick (http://www.ixquick.com/) only currently searches GoGraph and Picsearch. Although it reports large numbers of results, it is only possible to view a small selection of them (up to 10 from each engine).

‘Spring-board’ searching

These services are often classed as meta-search engines, but they are more like search ‘spring-boards’, since they only pass on the user’s search to one other search engine at a time (SearchEngineWatch term these ‘All-in-one’ search engines). Since they only offer very simple search options and are doing little more than passing a message on, they can seem like a waste of time and often are.

Photoseek (http://www.photoseek.net/) gives its users the option of searching Photos.com, ArtToday (no longer functioning), Ditto, Excite, Lycos or AltaVista. It simply passes on the request to the selected search engine and then frames the results.

Search 22 Picture and Image Search Engines (http://www.search-22.com/images.html) offers a much better spring-board to image searching than Photoseek or any of the proper meta- search engines listed above. Its twenty-two image search engines are a mixed bag, but it includes the key ones. Although Search 22 does not offer the sophistication available when these other engines are searched directly, it does enable the user to make a very quick check across a number of different services - perhaps useful in ascertaining which engines are worth searching in a more in-depth fashion.

3. Collection-based Image Search Engines

While image search engines strive for comprehensiveness and try as best they can to automatically identify and index images, collection-based engines draw on a smaller pool of catalogued images, usually from a database and usually indexed by humans. Large digital libraries and commercial photo or clip art providers are good examples of these collection-based search engines, though the largest group on the Web are probably adult-oriented services.

Digital library collections tend to be limited to particular subjects or formats (e.g. historic regional photographs or art collections), while the commercial stock photo collections aim for as broad a coverage as you would find in the independent image search engines evaluated above.

Because all of these collections are held in databases rather than embedded in Web pages, they tend to shut out the ‘spiders’ or ‘robots’ that independent search engines use to trawl and index the Internet. As a result, their images will seldom be found among general search engine results.
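
The review does not say how the spiders are shut out. One widely used mechanism is a robots.txt file, which a well-behaved crawler consults before fetching a page. The sketch below uses Python’s standard `urllib.robotparser` module. The rules, the paths and the crawler name `ExampleImageBot` are all hypothetical, and real collections may rely on other barriers, such as records that can only be reached through a search form.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an image collection: every crawler is kept out
# of the database-driven search and record pages, but may read static pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /records/
"""


def crawler_may_fetch(url, user_agent="ExampleImageBot", robots_txt=ROBOTS_TXT):
    """Return True if the given robots.txt rules allow this crawler to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Under these example rules a crawler could index the collection’s static pages, but not the individual image records, so the images themselves would not appear in a general search engine’s results.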

While most commercial image collections are very happy to shut out search spiders, many owners of digital library collections would like to make their contents more easily discoverable. It seems likely that advances in metadata and Web-coding will improve this situation. Currently, any comprehensive search for images would need to include image search engines and the relevant collection-based engines. (For more on finding images, see TASI Advice Document: Finding Images Online.)

Apart from their scope, the other chief difference between collection-based search engines and general image search engines is that the collection-based engines usually search on metadata entered by humans rather than on automatically generated indexes that guess at their images’ contents. Consequently, the results from a collection-based search are likely to be much more consistent and relevant than those delivered by automated engines.

TASI’s own database of Image Sites includes information on many image collections, particularly digital library collections. Since stock photo collections are aiming for the same kind of scope and comprehensiveness as general image search engines, we evaluate the two largest below (Corbis and Getty) to offer a comparison. In addition, we consider Webshots, a very large collection built from a mixture of stock collections and personal online photo albums. Last year we also reviewed StopStock (http://www.1stopstock.com/), a meta- search engine for stock photo collections. This has been dropped as it now only searches a handful of collections. Note that we searched Corbis, Getty and Webshots as a public - rather than a registered - user.

Corbis
http://www.corbis.com/

Founded by Microsoft’s Bill Gates in 1989, Corbis owns or licenses a vast collection of photography and fine art images. In 2003 Corbis was trying hard to reach the non-commercial user, by selling its images through Yahoo! and offering two separate interfaces to its own Web site: personal and professional. Now, in October 2004, the Corbis site is less concerned with personal use - although there is a small link to ‘Other licensing options’. This leads to another site offering cheap or subscription-based images for use on personal Web sites or in presentations. We review both the main, professional site and the subscription site here.

Main Corbis Professional Search Engine
http://pro.corbis.com/

Criteria | Overall Evaluation | Comments
Scope | Very broad | Extremely large and up-to-date collection. Corbis claim 70 million images, though it’s not clear how many are available to search.
Search Options | Best | Simple and advanced searches. The advanced search is the best TASI saw in this review, enabling quite sophisticated filtering (before and after the search) and full Boolean, phrase and truncated searching. Searches can be limited to specific categories or collections, by date added, point of view, orientation, resolution, type of image, and other criteria.
Performance | Very good | Quick, numerous, relevant and high-quality results, as expected from a large stock photo database. Default is to receive 1000 results, but this can be increased to 10000 - a number exceeded by some of our searches. All the results returned can be viewed.
Presentation | Good | Thumbnails first (up to 100 a page). Various file sizes are offered and the images are captioned and credited along with a date and keywords. There are clear statements of copyright and the larger images contain visible watermarks.
Support | Good | Very full online help and the option of contacting Corbis by form or email.
TASI’s test words | Good | Unsurprisingly, none of TASI’s specific files were found and there were poor results on the historic figure ‘Alexander Pope’. All the other terms tested produced very good results. For Madonna, the user is prompted to narrow the search to the singer or the Virgin Mary.

 

Corbis Personal Search Engine
http://subscriptions.corbis.com/

Criteria | Overall Evaluation | Comments
Scope | Unknown | Not stated, but clearly a much smaller and inferior collection to the one the professional engine searches (above).
Search Options | Average | Offers simple and advanced searching, although with much less sophistication than the professional search engine. No filtering or refinement is offered after the search.
Performance | Good | Results of a reasonable quality, clearly geared toward presentations. Users can take a subscription or buy one-offs in 640x480 or 1280x1024 pixel resolutions. Copyright is clearly stated and the larger images contain visible watermarks.
Presentation | Good | Thumbnails first with links to larger images and online commerce facilities.
Support | Good | Good online help and the ability to contact via a form.
TASI’s test words | Average | Unsurprisingly, none of the specific files were found. Results from some of the other terms were only slightly better than those found in some Web-indexing image search engines. None were recorded for West Bank or Alexander Pope, nor any of the singer Madonna.

 

Getty Images Creative
http://creative.gettyimages.com/

Getty Images was founded in 1995 by Mark Getty and Jonathan Klein and has grown to rival Corbis in the commercial image market. Over the past few years it has bought or formed partnerships with many large image collections, including the Bridgeman Art Library, Hulton Archive, ImageBank, National Geographic and the TimeLife collection. Many of Getty Images’ collections retain their own identities and often their own Web sites, but it is also possible to search across the collections via Getty Images’ Creative or Editorial search engines. We evaluated Getty’s Creative search engine, which is clearly pitched at the professional user.

Criteria | Overall Evaluation | Comments
Scope | Very broad | Extremely large and up-to-date collection serving millions of images (although exact statistics are unavailable).
Search Options | Very good | Simple and advanced options, including full Boolean, wildcard and phrase searching and a number of filtering options (e.g. orientation, photo type) available before or after a search.
Performance | Good | Generally very quick to search, although the number of hits was not always as high as expected, nor the quality as good.
Presentation | Good | Can view from 15-30 thumbnails per page. The detailed records include title, photographer and keywords, although the extent of the information varies from image to image depending on which collection it is drawn from. There is always a clear identification of copyright and rights status.
Support | Good | Good help is available online and more is available via a form or phone number.
TASI’s test words | Good | Unsurprisingly, none of the specific images were found. There were some good results on the general terms (elephant and flower), but poorer results on others - e.g. Alexander Pope or Cleopatra. We note that ‘Madonna’ and ‘West Bank’ scored poorly on Getty’s Creative search engine, but would obviously have achieved different results on an Editorial search.

 

Webshots
http://www.webshots.com/

Webshots is a community photo album with a commercial spin. Members store their own photographs online and these become available for others to search, view, and, in some cases, purchase.

Criteria | Overall Evaluation | Comments
Scope | Vast but eclectic | Claims 105 million photos (10 times that of February 2003) and growing at the rate of 500,000 per day! Collection includes individual contributions supplemented with stock photo collections, making for uneven and sometimes eclectic subject coverage.
Search Options | Average | Offers simple and advanced search interfaces, although the advanced is quite limited. Webshots excludes adult images from its collection.
Performance | Average | Some good results on some of the general terms, but poorer coverage of some of the more specific subjects we searched on.
Presentation | Average | 12 thumbnails per page along with minimal additional information (title, pixel dimensions and ownership/copyright information). Some Webshots images are available for purchase as screensaver images or as posters, others (the community contributions) can be viewed full-size or sent as e-cards.
Support | Poor | Online help concentrates on the commercial and photosharing functions rather than the search. There is no easy way to ask questions about searching.
TASI’s test words | Average | Unsurprisingly, none of the specific images were found and there were mixed results on the other search terms - some very good results for general terms (‘elephant’ and ‘flower’), but poorer results for some of the specific terms (‘West Bank’ returned mostly archaeological artefacts).

 

There are many stock photo collections available online. Here is a further selection:

Bridgeman Art Library
http://www.bridgeman.co.uk/
Mary Evans Picture Library
http://www.maryevans.com/
Clipart.com (was ArtToday)
http://www.clipart.com/
National Geographic Images
http://www.ngsimages.com/
Christie’s Images
http://www.christiesimages.com/
nonstock
http://www.nonstock.com/
Comstock
http://www.comstock.com/
Photos.com
http://www.photos.com/
eStock
http://estockphoto.com/
Photos To Go
http://www.photostogo.com/
Index Stock Imagery
http://www.indexstock.com/
Photospin
http://www.photospin.com/
Magnum Photos
http://www.magnumphotos.com/
Stockbyte
http://www.stockbyte.com/
Time and Life Pictures
http://www.timelifepictures.com/

More can be found via the membership lists of the British Association of Picture Libraries and Agencies (BAPLA - http://www.bapla.org) and the Picture Archive Council of America (PACA - http://www.stockindustry.org/). There is also a very useful online directory called Stock Index Online (http://www.stockindexonline.com/).

4. Content-based search engines

All the image search engines considered so far in this review have been based on text and context strategies or on associated catalogue entries. There have also been a number of attempts to build content-based search engines. Content-based image retrieval (CBIR) considers the characteristics of the image itself, for example its shapes and colours. To date, these attempts have been experimental and generally limited to individual collections.
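As a rough illustration of what comparing images by their colours can mean in practice, the sketch below ranks a tiny collection against a query image using quantised colour histograms and histogram intersection. It is not a description of how any particular system works: the function names, the number of colour bins and the toy pixel data are all assumptions made for this example.

```python
from collections import Counter


def colour_histogram(pixels, levels=4):
    """Return a normalised histogram of RGB pixels.

    Each channel is quantised into `levels` bins, so similar shades
    fall into the same bin. `levels=4` is an arbitrary choice.
    """
    step = 256 // levels
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bin_: n / total for bin_, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity between 0 and 1; 1 means identical colour distributions."""
    return sum(min(share, h2.get(bin_, 0.0)) for bin_, share in h1.items())


# Toy "images" as flat lists of (R, G, B) pixels. Invented data for illustration.
sunset = [(250, 120, 30)] * 70 + [(40, 20, 60)] * 30
sunrise = [(240, 110, 40)] * 60 + [(50, 30, 70)] * 40
forest = [(30, 120, 40)] * 90 + [(90, 60, 30)] * 10

query = colour_histogram(sunset)
collection = {
    "sunrise": colour_histogram(sunrise),
    "forest": colour_histogram(forest),
}

# Rank the collection by colour similarity to the query image.
ranked = sorted(
    collection,
    key=lambda name: histogram_intersection(query, collection[name]),
    reverse=True,
)
print(ranked)
```

Note that this approach knows nothing about what an image depicts: two pictures with similar colour distributions score highly whatever their subject, which is one reason such techniques tend to be combined with text-based indexing.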

Well-publicised efforts have included Columbia University’s WebSEEk project (http://persia.ee.columbia.edu:8008/) and IBM’s QBIC (Query by Image Content - http://wwwqbic.almaden.ibm.com/), which can be seen in action on the Hermitage Museum website (http://www.hermitagemuseum.org/).

The Institute for Image Data Research (IIDR) at Northumbria University (http://www.unn.ac.uk/iidr/) is currently working on a number of content-based retrieval projects and has a good set of links to other research programmes.

Some adult-site blockers use simple content analysis to filter out adult images, but these are not always effective, since they are based on screening out images with particular (flesh-coloured) tonal values.
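The weakness of this kind of screening can be shown with a small sketch. The colour rule and the 50% threshold below are hypothetical, chosen only to illustrate the idea, and are not taken from any real filtering product; the pixel data is invented.

```python
def looks_flesh_coloured(pixel):
    """Crude, hypothetical rule: a warm tone with red > green > blue."""
    r, g, b = pixel
    return r > 95 and g > 40 and b > 20 and r > g > b and (r - b) > 15


def flesh_fraction(pixels):
    """Share of pixels that the rule classes as flesh-coloured."""
    return sum(looks_flesh_coloured(p) for p in pixels) / len(pixels)


def is_blocked(pixels, threshold=0.5):
    """Block an image when at least `threshold` of its pixels match the rule."""
    return flesh_fraction(pixels) >= threshold


# Invented test images as flat lists of (R, G, B) pixels.
beach = [(224, 172, 105)] * 80 + [(70, 130, 200)] * 20  # sand and sky
greyscale_photo = [(128, 128, 128)] * 100               # any black-and-white image

print(is_blocked(beach))            # an innocent beach scene is blocked
print(is_blocked(greyscale_photo))  # a monochrome image always passes
```

A filter of this kind judges only tonal values, so it both blocks harmless images dominated by sandy or skin-like colours and lets through anything that happens to fall outside its colour range.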

5. Conclusion

Image search engines attempt to give access to the wide range of images available on the Internet.

For those used to viewing well-indexed collections of quality images, the results of the large automated image search engines will probably disappoint. The poor quality of their offerings is not surprising, since they reflect the randomness and unevenness of the Web. The frequent irrelevancy of their results is also explicable, since the automated engines are guessing at their images’ visual subject content using indirect textual clues.

Anything, then, that enables the user to have more control over their image searching is helpful. The ability to filter a search - to include and exclude items - is important in any Web searching, but particularly so when searching for images. Many users will wish to exclude adult imagery from their search results, but it can also be very useful to limit by file type, file size, or colour - and the ability to use Boolean logic or phrases will greatly improve the relevancy of the results. An image search engine also needs to return a reasonable number of results, since in any given search a fair proportion of the images found are likely to be irrelevant or of insufficient quality.
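The effect of these controls can be sketched in a few lines. The record structure, field names and sample results below are invented for illustration and do not correspond to any search engine’s actual interface; the sketch simply contrasts a Boolean AND search with a phrase search, combined with file type and size limits.

```python
from dataclasses import dataclass


@dataclass
class ImageResult:
    """Hypothetical search result record."""
    title: str
    file_type: str
    width: int
    adult: bool = False


def filter_results(results, all_of=(), none_of=(), phrase=None,
                   file_types=None, min_width=0, exclude_adult=True):
    """Keep only the results that satisfy every supplied filter."""
    kept = []
    for result in results:
        text = result.title.lower()
        words = set(text.split())
        if exclude_adult and result.adult:
            continue
        if not all(word.lower() in words for word in all_of):   # Boolean AND
            continue
        if any(word.lower() in words for word in none_of):      # Boolean NOT
            continue
        if phrase and phrase.lower() not in text:               # exact phrase
            continue
        if file_types and result.file_type.lower() not in file_types:
            continue
        if result.width < min_width:
            continue
        kept.append(result)
    return kept


results = [
    ImageResult("West Bank street scene", "jpg", 800),
    ImageResult("River bank west side", "jpg", 640),
    ImageResult("West Bank map", "gif", 120),
    ImageResult("Bank of England west front", "jpg", 1024),
]

# Both words anywhere in the title: three results, two of them irrelevant.
and_hits = filter_results(results, all_of=("west", "bank"),
                          file_types={"jpg"}, min_width=400)

# The same search as a phrase: only the relevant result remains.
phrase_hits = filter_results(results, phrase="west bank",
                             file_types={"jpg"}, min_width=400)

print([r.title for r in and_hits])
print([r.title for r in phrase_hits])
```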

Here the meta- image search engines fail their users. They deny them sophisticated filtered searching and generally only bring back a handful of the results they find. A meta- or spring-board service like Search22 can be useful in identifying the best image search engine for a particular task, but the final searching is better done directly, using the source search engines themselves.

Collection-based image search engines include images selected for quality and indexed by hand. The images they contain are seldom found within the results of general Web search engines. Collection-based engines, then, will usually offer much better results than their search engine counterparts. The commercial and copyright issues will also be much clearer - although many users seem to prefer to look to image search engines for images they can ‘freely’ re-use, as if easy access and absent copyright notices let them off the moral and legal hook. Truly copyright-free images are rare on the Web, and many image search engines do operate on commercial imperatives, even subtly skewing their results towards commercial ends.

The Web is a fast-changing environment, so it is likely that some of the information in this review will quickly date. The reader is well advised to check the sites themselves and to ask the kind of questions listed in the evaluation criteria, above. If you spot errors or discover search engines we’ve missed, please let us know at info@tasi.ac.uk.

The most important question, however, is not on our evaluation criteria and can only be answered by each individual user: ‘For what purpose do I want to find images?’ The answer to this question needs to inform the choice of search tools, and the choice of search terms you use.

For more advice on searching for images see TASI’s Advice Document: Finding Images Online.

Other sources of information include:

Search Engine Watch
http://www.searchenginewatch.com/
A site devoted to monitoring and evaluating search engine developments. Some information is subscription-only, but much is free.

Search Engine Showdown
http://notess.com/search/
One librarian’s very useful site on searching and search engines.

Last reviewed October 2004
