2012
DOI: 10.1007/978-3-642-28997-2_46

Using a Medical Thesaurus to Predict Query Difficulty

Abstract: Estimating query performance is the task of predicting the quality of results returned by a search engine in response to a query. In this paper, we focus on pre-retrieval prediction methods for the medical domain. We propose a novel predictor that exploits a thesaurus to ascertain how difficult queries are. In our experiments, we show that our predictor outperforms the state-of-the-art methods that do not use a thesaurus.

Cited by 5 publications (9 citation statements)
References 6 publications
“…Thus, the clarity score of a query is computed here at a post retrieval stage, as the Kullback‐Leibler divergence between the query language model and the collection language model. Clarity through topic coverage CTCla ( Q ): a query is assumed to be as much clear as it covers a few general semantic levels of MeSH terminology (Znaidi et al., ). This score is computed at the preretrieval stage. Clarity through concept coverage CCCla ( Q ): a query is assumed to be as much clear as query words match concepts issued from MeSH terminology (Boudin et al., ). This score is also computed at the preretrieval stage.…”
Section: Methods (mentioning)
confidence: 99%
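The post-retrieval clarity score described in the statement above can be illustrated with a minimal sketch. It assumes unsmoothed maximum-likelihood language models, a query model estimated from the tokens of a few top-ranked documents, and that words missing from the collection model are simply skipped. The function names and toy data are hypothetical and do not come from the cited papers.

```python
import math
from collections import Counter


def language_model(tokens):
    """Maximum-likelihood unigram model: word -> relative frequency."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def clarity_score(query_model, collection_model):
    """Kullback-Leibler divergence (in bits) of the query model from the collection model."""
    score = 0.0
    for word, p_query in query_model.items():
        p_collection = collection_model.get(word)
        # Assumption: words absent from the collection model are skipped
        # rather than smoothed.
        if p_query > 0 and p_collection:
            score += p_query * math.log2(p_query / p_collection)
    return score


# Toy data (hypothetical): the collection is uniform over four words,
# while the top-ranked documents for the query focus on two of them.
collection_tokens = ["heart", "attack", "diet", "sleep"]
feedback_tokens = ["heart", "heart", "attack", "attack"]

collection_lm = language_model(collection_tokens)
query_lm = language_model(feedback_tokens)
score = clarity_score(query_lm, collection_lm)
```

A query model identical to the collection model scores zero, and the score grows as the query model concentrates on fewer words, which is the intuition behind treating a high score as a sign of a clear query.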
“…Many studies in the literature have investigated different aspects of medical‐ and health‐related information seeking and retrieval. These studies generally relied on empirical evaluations conducted with samples of users in order to investigate the users' information need peculiarities (Lykke, Price, & Delcambre, ; Spink et al., ; Zhang & Fu, ), query difficulty (Lykke et al., ; W. Hersh et al., ; Boudin, Nie, & Dawes, ), user behavior (Dogan, Muray, Neveol, & Lu, ; Ely et al., ), the effect of context on search (Cartright, While, & Horvitz, ; Freund, Toms, & Waterhouse, ; Lykke et al., ; White, Dumais, & Teevan, ), the search accuracy, the quality and the reliability of medical information (Moturu, Liu, & Johnson, ; Pandolfini, ), etc. Related findings provide insights into medical information search activity and suggest implications for the design of improved medical IR systems.…”
Section: Studies On Medical‐ and Health‐related Information Needs And (mentioning)
confidence: 99%
“…Recently, thesaurus is used to predict query difficulty in medical domain. It was concluded that the performance of the predictor is influencing with many factors such as the coverage of thesaurus or query mapping quality [8]. Earlier studies assumed that there are no general thesauri such that sufficient coverage are available, so that the use and impact of thesaurus was not studied widely [8].…”
mentioning
confidence: 99%
“…It was concluded that the performance of the predictor is influencing with many factors such as the coverage of thesaurus or query mapping quality [8]. Earlier studies assumed that there are no general thesauri such that sufficient coverage are available, so that the use and impact of thesaurus was not studied widely [8]. However, a high quality thesaurus is available for some specific domains, also many thesauri with different coverage abilities and sizes are found in the same domain.…”
mentioning
confidence: 99%