Content Based Retrieval in Extremely Large Visual Media Collections
This article gives an overview of content-based retrieval research.
How do you find a picture of a ballet dancer? of a painting by Rembrandt? of a red Chinese dragon? of the Fonz (à la Happy Days) being cool in his leather jacket and boots? of Cindy Crawford or Claudia Schiffer running on a beach? How does a graphic artist find material for a new collage? How does a clothing designer choose among all the different textures and fabrics: cotton, wool, lycra, canvas, latex, silk, rayon? How can you find a picture of Nixon shaking hands with Elvis?

Currently, about 80% of the information on the World Wide Web consists of images, yet there are few or no methods for finding this hidden treasure of information. Leiden University has been developing new methods of finding images and other media for next-generation Web search engines.

In content-based search, it is often necessary to search extremely large image and video collections. These collections often have little or no annotation, so purely text-based methods may be ineffective. In the ImageScape project, we are striving to develop the technologies that will make very large visual media collections accessible to non-experts.

Our ongoing research includes methods that take into account important factors such as: (1) the sheer size of the databases, over 10 million images and videos; and, most importantly, (2) query mechanisms intended for creative artists and non-experts. This has already resulted in innovative data structures for compressing image databases, computationally efficient algorithms for finding similar images, and an intuitive icon-based user interface. Methods for detecting visual concepts in images and video are also an essential aspect of this project.
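
To make the idea of "finding similar images" concrete, here is a minimal sketch of one classic baseline for content-based retrieval: comparing coarse color histograms with histogram intersection. This is not the ImageScape method, which the text does not describe in detail. The function names, the toy "images" (plain lists of RGB tuples), and the choice of 4 bins per channel are all assumptions made for illustration.

```python
from collections import Counter


def color_histogram(pixels, bins_per_channel=4):
    """Normalized, coarsely quantized RGB histogram.

    `pixels` is a non-empty list of (r, g, b) tuples with values in 0..255.
    Assumes `bins_per_channel` divides 256 evenly (illustrative default: 4).
    """
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bin_id: n / total for bin_id, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means identical color distributions."""
    return sum(min(v, h2.get(k, 0.0)) for k, v in h1.items())


def rank_by_similarity(query_pixels, collection):
    """Return (name, score) pairs sorted from most to least similar."""
    query_hist = color_histogram(query_pixels)
    scores = [
        (name, histogram_intersection(query_hist, color_histogram(pixels)))
        for name, pixels in collection.items()
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical toy collection: each "image" is just a list of RGB pixels.
collection = {
    "red_dragon": [(220, 20, 30)] * 90 + [(250, 210, 40)] * 10,
    "beach": [(60, 140, 230)] * 60 + [(240, 220, 170)] * 40,
    "leather_jacket": [(20, 20, 20)] * 80 + [(200, 200, 200)] * 20,
}

# Query by example: a mostly red image.
query = [(210, 30, 40)] * 100
ranking = rank_by_similarity(query, collection)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

The query needs no text annotation at all: the mostly red query image ranks the red image first purely from pixel content. A linear scan like this would not scale to a collection of over 10 million items, which is why compact data structures and efficient similarity algorithms matter in practice.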
