We present a generic model for multimodal information retrieval, leveraging different information sources to improve the effectiveness of a retrieval system. The proposed method is able to take into account both explicit and latent semantics present in the data and can be used to answer complex queries that are not currently answerable by either document retrieval systems or semantic web systems. By providing a hybrid approach combining IR and structured search techniques, we prepare a framework applicable to multimodal data collections. To test its effectiveness, we instantiate the model for an image retrieval task.
We present a model for multimodal information retrieval, leveraging different information sources to improve the effectiveness of a retrieval system. This method takes into account multifaceted IR as well as the semantic relations present in data objects, which can be used to answer complex queries by combining similarity and semantic search. By providing a graph data structure and using hybrid search in addition to structured search techniques, we take advantage of relations in the data to improve retrieval. We tested the model on the ImageCLEF 2011 Wikipedia collection, a multimodal benchmark data collection, for an image retrieval task.
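The abstract above describes blending similarity search with semantic relations stored in a graph of data objects. The following is a minimal sketch of one way such hybrid scoring could work; it is not the authors' implementation, and all names (HybridIndex, add_relation, the alpha weight) are illustrative assumptions.

```python
# Hedged sketch: combine a per-object similarity score with a boost from
# semantically related objects in a small relation graph.
from collections import defaultdict

class HybridIndex:
    def __init__(self):
        self.relations = defaultdict(set)  # object_id -> related object_ids

    def add_relation(self, a, b):
        """Record an undirected semantic link (e.g. image 'depicted in' article)."""
        self.relations[a].add(b)
        self.relations[b].add(a)

    def search(self, query_scores, alpha=0.7, top_k=10):
        """Blend each object's own similarity with the best score among its neighbours."""
        results = {}
        for obj, score in query_scores.items():
            neighbour_best = max(
                (query_scores.get(n, 0.0) for n in self.relations[obj]),
                default=0.0,
            )
            results[obj] = alpha * score + (1 - alpha) * neighbour_best
        return sorted(results.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

# Usage: similarity scores would come from separate text/image retrieval engines.
index = HybridIndex()
index.add_relation("img_42", "doc_7")  # image linked to the article describing it
print(index.search({"img_42": 0.35, "doc_7": 0.80, "img_13": 0.50}))
```

Here a weakly matching image can still rank highly when a strongly matching, semantically linked document exists, which is the intuition behind exploiting relations to improve retrieval.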
We propose an image-based class retrieval system for ancient Roman Republican coins that can be instrumental in various archaeological applications such as museums, numismatic study, and even online auction websites. For such applications, the aim is not only the classification of a given coin, but also the retrieval of its information from a standard reference book. Such classification and information retrieval are performed by our proposed system via a user-friendly graphical user interface (GUI). The query coin image is matched with exemplar images of each coin class stored in the database. The retrieved coin classes are then displayed in the GUI along with their descriptions from a reference book. However, matching a query image with every class exemplar image is highly impractical, as there are 10 exemplar images for each of the 60 coin classes. Similarly, displaying all the retrieved coin classes and their respective information in the GUI is inconvenient for the user. Consequently, to avoid such brute-force matching, we incrementally vary the number of matches per class to find the fewest matches that attain the maximum classification accuracy. In a similar manner, we also extend the search space of coin classes to find the minimal number of retrieved classes that achieves maximum classification accuracy. On the current dataset, our system attains a classification accuracy of 99% with five matches per class when the top ten retrieved classes are considered. As a result, the computational complexity is reduced by matching the query image with only half of the exemplar images per class. In addition, displaying the top 10 retrieved classes is far more convenient than displaying all 60 classes.
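The parameter sweep described above (incrementally varying matches per class and the number of retrieved classes) can be sketched as a simple grid search. This is an assumed illustration, not the paper's code: `classify` and `queries` stand in for the actual matching pipeline and the labelled validation queries.

```python
# Hedged sketch: find the smallest (matches per class, retrieved classes)
# setting that reaches the maximum classification accuracy.

def sweep(queries, classify, max_matches=10, max_classes=60):
    best = None  # (accuracy, matches_per_class, top_classes)
    for m in range(1, max_matches + 1):        # exemplar matches per class
        for k in range(1, max_classes + 1):    # retrieved classes considered
            correct = sum(
                1 for image, true_class in queries
                if true_class in classify(image, matches_per_class=m, top_classes=k)
            )
            accuracy = correct / len(queries)
            # keep the first (i.e. smallest m, k) setting that reaches a new maximum
            if best is None or accuracy > best[0]:
                best = (accuracy, m, k)
    return best
```

Because the loops run in ascending order and the best setting is only replaced on a strictly higher accuracy, the returned configuration is the cheapest one attaining the maximum, mirroring the reported result of five matches per class with the top ten classes.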
This paper is concerned with potential recall in multimodal information retrieval in graph-based models. We provide a framework to leverage the individual and combined features of different modalities through our formulation of faceted search. We employ a potential recall analysis on a test collection to gain insight into the corpus and further highlight the role of multiple facets, relations between objects, and semantic links in recall improvement. We conduct the experiments on a multimodal dataset containing approximately 400,000 documents and images. We demonstrate that leveraging multiple facets most notably increases recall for very hard topics, by up to 316%.
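One way to read the potential recall analysis above is as an upper bound obtained by pooling the results of the individual facets. The sketch below is an assumed illustration of that idea (the facet names and document ids are invented), not the authors' evaluation code.

```python
# Hedged sketch: potential recall as the recall of the union of per-facet result sets.

def potential_recall(facet_results, relevant):
    """facet_results: dict facet -> set of retrieved doc ids; relevant: set of relevant doc ids."""
    union = set().union(*facet_results.values()) if facet_results else set()
    return len(union & relevant) / len(relevant) if relevant else 0.0

# Example: no single facet finds all relevant items, but their union recovers most of them.
facets = {"text": {"d1", "d2"}, "image": {"d3"}, "links": {"d2", "d4"}}
print(potential_recall(facets, relevant={"d1", "d3", "d4", "d9"}))  # 0.75
```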