Abstract: The large volume of biomedical literature poses a serious problem for medical professionals, who often struggle to keep current with it. At the same time, many health providers consider knowledge of the latest literature in their field a key component of successful clinical practice. In this work, we introduce two systems designed to help retrieve medical literature. Both receive a long, discursive clinical note as an input query and return highly relevant literature that could be used in support of cli…
“…(1) Local context: first, an initial retrieval run is performed to build a list of ranked documents in response to the user's query. Most of the proposed QE methods are based on traditional document ranking models such as the language model [38,115,153,181,198] and the probabilistic model [151,201]. Then, terms contained in the top-ranked documents are extracted using a blind or pseudo-relevance feedback approach.…”
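The local-context procedure quoted above (an initial retrieval run, followed by term extraction from the top-ranked documents) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names `tokenize`, `rank`, and `expand_query` are hypothetical, the term-overlap scorer is a toy stand-in for the language or probabilistic ranking models the passage mentions, and expansion terms are picked by raw frequency rather than by any weighting scheme from the cited work.

```python
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer (assumption; real systems use proper analyzers).
    return text.lower().split()


def rank(query_terms, docs):
    # Toy first-pass retrieval: score each document by the number of
    # query-term occurrences it contains, then sort by descending score.
    scored = []
    for idx, doc in enumerate(docs):
        counts = Counter(tokenize(doc))
        scored.append((sum(counts[t] for t in query_terms), idx))
    # Ties are broken by document index so the ranking is deterministic.
    return [idx for _, idx in sorted(scored, key=lambda p: (-p[0], p[1]))]


def expand_query(query, docs, top_k=2, n_terms=2):
    # Pseudo-relevance feedback: assume the top_k documents are relevant
    # and add their most frequent non-query terms to the query.
    query_terms = tokenize(query)
    ranking = rank(query_terms, docs)
    feedback = Counter()
    for idx in ranking[:top_k]:
        feedback.update(t for t in tokenize(docs[idx]) if t not in query_terms)
    # Sort by descending frequency, then alphabetically, for determinism.
    ordered = sorted(feedback.items(), key=lambda p: (-p[1], p[0]))
    return query_terms + [term for term, _ in ordered[:n_terms]]
```

Calling `expand_query("chest pain", docs)` on a small collection returns the original query terms followed by the most frequent terms from the two highest-scoring documents. In practice the expanded query would then be issued in a second retrieval run.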
The explosive growth and widespread accessibility of medical information on the Internet have led to a surge of research activity in a wide range of scientific communities, including health informatics and information retrieval (IR). One of the common concerns of this research, across these disciplines, is how to design either clinical decision support systems or medical search engines capable of providing adequate support for both novices (e.g., patients and their next-of-kin) and experts (e.g., physicians, clinicians) tackling complex tasks (e.g., search for diagnosis, search for a treatment). However, despite the significant multi-disciplinary research advances, current medical search systems exhibit low levels of performance. This survey provides an overview of the state of the art in the disciplines of IR and health informatics and, by bridging these disciplines, shows how semantic search techniques can facilitate medical IR. First, we give a broad picture of semantic search and medical IR and then highlight the major scientific challenges. Second, focusing on the semantic gap challenge, we discuss representative state-of-the-art work related to feature-based as well as semantic-based representation and matching models that support medical search systems. In addition to seminal works, we present recent works that rely on research advancements in deep learning. Third, we make a thorough cross-model analysis and provide some findings and lessons learned. Finally, we discuss some open issues and promising directions for future research.
“…Ad-hoc document retrieval (of both scientific articles and general domain documents) has been long-studied (Lalmas and Tombros, 2007; Hersh and Voorhees, 2009; Lin, 2008; Medlar et al., 2016; Sorkhei et al., 2017; Huang et al., 2019; Hofstätter et al., 2020; Nogueira et al., 2020b). Most recent work for scientific literature retrieval has focused on tasks such as collaborative filtering (Chen and Lee, 2018), citation recommendation (Nogueira et al., 2020a), and clinical decision support (Soldaini et al., 2017).…”
With worldwide concerns surrounding the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), there is a rapidly growing body of scientific literature on the virus. Clinicians, researchers, and policymakers need to be able to search these articles effectively. In this work, we present a zero-shot ranking algorithm that adapts to COVID-related scientific literature. Our approach filters training data from another collection down to medical-related queries, uses a neural reranking model pre-trained on scientific text (SciBERT), and filters the target document collection. This approach ranks top among zero-shot methods on the TREC-COVID Round 1 leaderboard, and exhibits a P@5 of 0.80 and an nDCG@10 of 0.68 when evaluated on both Round 1 and 2 judgments. Despite not relying on TREC-COVID data, our method outperforms models that do. As one of the first search methods to thoroughly evaluate COVID-19 search, we hope that this serves as a strong baseline and helps in the global crisis.
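For readers unfamiliar with the two metrics reported in this abstract, the sketch below shows one common way to compute P@k and nDCG@k for a single query. It is an illustration under stated assumptions, not the evaluation code behind the reported numbers: the function names are hypothetical, the gain is linear (some formulations use an exponential gain), and the ideal ranking is derived only from the relevance labels of the ranked list itself, whereas standard evaluation tools build it from all judged documents for the query.

```python
import math


def precision_at_k(rels, k):
    # rels: graded relevance labels of the ranked results, best-ranked first.
    # A result counts as relevant if its label is greater than zero.
    return sum(1 for r in rels[:k] if r > 0) / k


def dcg_at_k(rels, k):
    # Discounted cumulative gain with linear gain and a log2 rank discount.
    return sum(r / math.log2(rank + 2) for rank, r in enumerate(rels[:k]))


def ndcg_at_k(rels, k):
    # Normalize by the DCG of the same labels sorted into the ideal order.
    # Simplification (assumption): only labels present in rels are used.
    ideal = dcg_at_k(sorted(rels, reverse=True), k)
    return dcg_at_k(rels, k) / ideal if ideal > 0 else 0.0
```

A ranking with four relevant documents among its top five gives `precision_at_k(rels, 5) == 0.8`. System-level figures such as those above are typically averages of these per-query values over all topics.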