Background Identifying visually and semantically similar radiological images in a database can facilitate the creation of decision support tools, teaching files, and research cohorts. Existing content-based image retrieval tools are often limited to searching by pixel-wise difference or by vector distance between model predictions. Vision transformers (ViTs) use attention to account for radiological diagnosis and visual appearance simultaneously.

Purpose To develop a ViT-based image retrieval framework and evaluate the algorithm on NIH Chest Radiographs (CXR) and NLST Chest CTs.

Materials and Methods The model was trained on 112,120 CXR and 111,955 CT images. For CXR, binary ViT classifiers were trained on 4 ground truth labels (Cardiomegaly, Opacity, Emphysema, No Finding) and ensembled to produce multilabel classifications for each CXR. For CT, a regression model was trained to minimize L1 loss on the continuous ground truth labels of patient weight. The ViT image embedding layer was treated as a global image descriptor, with the L2 distance between descriptors used as the similarity measure. To evaluate the model qualitatively, five radiologists performed a reader performance study on random query images (25 CT, 25 CXR). For each query, they chose the 5 most similar images from a set of 10 (the 5 closest and the 5 farthest images from the query in model space). Inter-radiologist and radiologist-model agreement statistics were calculated.

Results The CXR model achieved nDCG@5 of 0.73 (p<0.001) and Cardiomegaly mAP@5 of 0.76 (p<0.001), among other results. The CT model achieved nDCG of 16.85 (p<0.001). Model predictions agreed with radiologist consensus on 86% of CXR samples and 79.2% of CT samples. An inter-radiologist Fleiss kappa of 0.51 and a radiologist-consensus-to-model Cohen's kappa of 0.65 were observed. A t-SNE plot of the CT model latent space was generated to confirm that similar images cluster together.
Conclusion Our ViT architecture retrieved visually and semantically similar radiological images.
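The retrieval step described in Materials and Methods (treating an embedding as a global image descriptor and ranking database images by L2 distance) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, toy embeddings, and NumPy usage are assumptions; in the paper the descriptors come from the ViT image embedding layer.

```python
import numpy as np

def retrieve_top_k(query_emb, db_embs, k=5):
    """Rank database descriptors by L2 distance to the query descriptor
    and return the indices of the k nearest images.

    Illustrative only: in the paper, each row of db_embs would be the
    ViT embedding-layer output for one database image.
    """
    # Euclidean (L2) distance from the query to every database embedding
    dists = np.linalg.norm(db_embs - query_emb, axis=1)
    # Indices of the k smallest distances, i.e. the k most similar images
    return np.argsort(dists)[:k]

# Toy 2-D descriptors: image 0 is identical to the query, so it ranks first.
db = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]])
query = np.array([0.0, 0.0])
print(retrieve_top_k(query, db, k=2))  # → [0 1]
```

In practice the same ranking could be served at scale with an approximate nearest-neighbor index; brute-force L2 search as above is exact and sufficient for moderate database sizes.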