Tags associated with social images are valuable information source for superior image search and retrieval experiences. Although various heuristics are valuable to boost tag-based search for images, there is a lack of general framework to study the impact of these heuristics. Specifically, the task of ranking images matching a given tag query based on their associated tags in descending order of relevance has not been well studied. In this article, we take the first step to propose a generic, flexible, and extensible framework for this task and exploit it for a systematic and comprehensive empirical evaluation of various methods for ranking images. To this end, we identified five orthogonal dimensions to quantify the matching score between a tagged image and a tag query. These five dimensions are: (i) tag relatedness to measure the degree of effectiveness of a tag describing the tagged image; (ii) tag discrimination to quantify the degree of discrimination of a tag with respect to the entire tagged image collection; (iii) tag length normalization analogous to document length normalization in web search; (iv) tag-query matching model for the matching score computation between an image tag and a query tag; and (v) query model for tag query rewriting. For each dimension, we identify a few implementations and evaluate their impact on NUS-WIDE dataset, the largest humanannotated dataset consisting of more than 269K tagged images from Flickr. We evaluated 81 single-tag queries and 443 multi-tag queries over 288 search methods and systematically compare their performances using standard metrics including Precision at top-K, Mean Average Precision (MAP), Recall, and Normalized Discounted Cumulative Gain (NDCG).
This paper tackles the challenge of forensic medical image matching (FMIM) using deep neural networks (DNNs). FMIM is a particular case of content-based image retrieval (CBIR). The main challenge in FMIM compared to the general case of CBIR, is that the subject to whom a query image belongs may be affected by aging and progressive degenerative disorders, making it difficult to match data on a subject level. CBIR with DNNs is generally solved by minimizing a ranking loss, such as Triplet loss (TL), computed on image representations extracted by a DNN from the original data. TL, in particular, operates on triplets: anchor, positive (similar to anchor) and negative (dissimilar to anchor). Although TL has been shown to perform well in many CBIR tasks, it still has limitations, which we identify and analyze in this work. In this paper, we introduce (i) the AdaTriplet loss -an extension of TL whose gradients adapt to different difficulty levels of negative samples, and (ii) the AutoMargin method -a technique to adjust hyperparameters of margin-based losses such as TL and our proposed loss dynamically. Our results are evaluated on two large-scale benchmarks for FMIM based on the Osteoarthritis Initiative and Chest X-ray-14 datasets. The codes allowing replication of this study have been made publicly available at https://github.com/Oulu-IMEDS/AdaTriplet.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.