Abstract
Image retrieval systems that compare the query image exhaustively with every image in the database do not scale to large databases. A scalable search system should ensure that search time does not grow linearly with the number of images in the database. We present a clustering-based indexing technique in which the database images are grouped, using a hierarchical clustering algorithm, into clusters of images with similar color content. At search time, the query image is compared not with every image in the database but only with a small subset. Experiments show that this clustering-based approach offers a superior response time while maintaining high retrieval accuracy. Experiments with different database sizes indicate that, for a given retrieval accuracy, search time does not increase linearly with database size.
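The idea of clustering-based indexing can be illustrated with a minimal sketch. It is not the implementation evaluated in this work. It assumes that each image is summarized by a normalized color histogram, that histograms are compared with the L1 distance, that clusters are merged by centroid distance, and that only the single nearest cluster is probed at search time. The function names, the toy three-bin histograms, and the parameters `num_clusters`, `clusters_to_probe`, and `top_k` are all hypothetical.

```python
def l1(a, b):
    """L1 distance between two histograms (an assumed similarity measure)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def centroid(vectors):
    """Bin-wise mean of a list of histograms."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def build_clusters(histograms, num_clusters):
    """Naive agglomerative clustering: repeatedly merge the two clusters
    whose centroids are closest until num_clusters remain."""
    clusters = [[i] for i in range(len(histograms))]
    centroids = [list(h) for h in histograms]
    while len(clusters) > num_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = l1(centroids[i], centroids[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        centroids[i] = centroid([histograms[k] for k in clusters[i]])
        del clusters[j]
        del centroids[j]
    return clusters, centroids


def search(query, histograms, clusters, centroids, clusters_to_probe=1, top_k=3):
    """Compare the query with cluster centroids first, then only with the
    images inside the nearest clusters_to_probe clusters."""
    order = sorted(range(len(clusters)), key=lambda c: l1(query, centroids[c]))
    candidates = [i for c in order[:clusters_to_probe] for i in clusters[c]]
    candidates.sort(key=lambda i: l1(query, histograms[i]))
    return candidates[:top_k]


# Toy database: hypothetical 3-bin (red, green, blue) histograms.
db = [
    [0.90, 0.05, 0.05],  # 0: mostly red
    [0.80, 0.10, 0.10],  # 1: mostly red
    [0.10, 0.80, 0.10],  # 2: mostly green
    [0.05, 0.90, 0.05],  # 3: mostly green
    [0.10, 0.10, 0.80],  # 4: mostly blue
    [0.05, 0.05, 0.90],  # 5: mostly blue
]

clusters, centroids = build_clusters(db, num_clusters=3)
result = search([0.88, 0.07, 0.05], db, clusters, centroids)
print(clusters)
print(result)
```

In this sketch the query is compared with three centroids and two database images rather than with all six images. Probing more clusters would trade search time for retrieval accuracy.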