2019
DOI: 10.1145/3345001

Boosting Search Performance Using Query Variations

Abstract: Rank fusion is a powerful technique that allows multiple sources of information to be combined into a single result set. However, to date fusion has not been regarded as being cost-effective in cases where strict per-query efficiency guarantees are required, such as in web search. In this work we propose a novel solution to rank fusion by splitting the computation into two parts: one phase that is carried out offline to generate pre-computed centroid answers for queries with broadly similar information needs, a…
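The abstract concerns fusing result lists produced by multiple query variations for the same information need. As a hedged illustration of rank fusion in general, and not the paper's specific two-phase offline/online method, the sketch below applies reciprocal rank fusion (RRF), one standard fusion technique, to the ranked lists returned for several variations; the function name, document identifiers, and example runs are illustrative assumptions.

```python
from collections import defaultdict

def reciprocal_rank_fusion(result_lists, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    result_lists: one best-first list of doc ids per query variation.
    k: the conventional RRF damping constant (60 in the original RRF paper).
    """
    scores = defaultdict(float)
    for ranking in result_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            # Every list contributes 1 / (k + rank) for each document it returns,
            # so documents ranked highly by many variations rise to the top.
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


# Three hypothetical query variations for the same information need.
runs = [
    ["d3", "d1", "d7"],
    ["d1", "d3", "d9"],
    ["d1", "d7", "d2"],
]
print(reciprocal_rank_fusion(runs))   # ['d1', 'd3', 'd7', 'd9', 'd2']
```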

Cited by 26 publications (27 citation statements)
References 72 publications (97 reference statements)
“…Bailey et al [8] also employed crowdsourcing, constructing a test collection that associates multiple user queries with each information need, with each of those needs expressed as a personalized text backstory derived from a single TREC topic. Having a set of user query variations associated with each of the TREC topics, rather than just a single query, has enabled enhanced understanding in a range of areas: test collection judgment pool methodology [37]; the consistency [9] and risk [11] properties of retrieval models; the quality of automatic query generation approaches [32]; and new implementation options for efficient search on web corpora [14]. Similar query collections have also been created by teams working on TREC-initiated activities [12,13], adopting the notion of an information need expressed as a backstory, and also adopting the previous mode of presentation of the backstory, as text to be read by the crowdworker.…”
Section: Background and Motivation (citation type: mentioning, confidence: 99%)
“…A curated subset of the topics that survived the filtering stages was then created. Fifteen viable topics from each month spanned by the collection were selected at random, and each was then inspected by two of a panel of six IR experts, taking the Reddit thread title to be ground truth. In this blind experiment each expert considered a sequence of Reddit thread titles, document titles, and short and long summaries, with the latter two drawn from either the Extractive or Intro approaches at random; and for each of those summary options was asked to assess how accurately it conveyed the assumed intent of the Reddit title, using a five-point Likert scale, with five indicating "accurate".…”
Section: Generating Backstories (citation type: mentioning, confidence: 99%)
“…If this relation does not hold, then the current document is not able to make it into the top-k results, and processing continues with the next document (line 24). Otherwise, the pivot document is sought in the current list (line 26) and, if found, its score is computed (lines 27-29). This loop continues until either the document cannot make the heap, or the document is fully scored.…”
Section: Document-at-a-Time (citation type: mentioning, confidence: 99%)
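The line numbers in the quote above refer to an algorithm listing in the citing paper, which is not reproduced here. As a rough illustration of the idea being walked through (selecting a pivot document against the heap-entry threshold, seeking it in each posting list, and scoring it only if it can still make the top-k), here is a minimal WAND-style document-at-a-time sketch. The PostingList class, its fields, and the scoring details are illustrative assumptions, not the citing paper's implementation.

```python
import heapq

class PostingList:
    """Minimal posting-list cursor: (doc_id, term_score) pairs plus an upper bound."""
    def __init__(self, postings, max_score):
        self.postings = sorted(postings)   # ascending doc_id
        self.max_score = max_score         # upper bound on any score in this list
        self.pos = 0

    def current(self):
        return self.postings[self.pos][0] if self.pos < len(self.postings) else float("inf")

    def seek(self, doc_id):
        # Advance to the first posting with an id >= doc_id.
        while self.pos < len(self.postings) and self.postings[self.pos][0] < doc_id:
            self.pos += 1
        return self.current()

    def score(self):
        return self.postings[self.pos][1]


def wand_top_k(lists, k):
    """WAND-style document-at-a-time top-k retrieval (illustrative sketch)."""
    heap, theta = [], 0.0                  # min-heap of (score, doc_id); entry threshold
    while True:
        # Pivot selection: walk the lists in order of their current doc id and
        # stop at the first point where the accumulated upper bounds exceed theta.
        ordered = sorted(lists, key=lambda l: l.current())
        upper, pivot = 0.0, None
        for lst in ordered:
            upper += lst.max_score
            if upper > theta:
                pivot = lst.current()
                break
        if pivot is None or pivot == float("inf"):
            break                          # no remaining document can make the heap
        # Seek the pivot in every list and, where found, add its contribution.
        score = 0.0
        for lst in lists:
            if lst.seek(pivot) == pivot:
                score += lst.score()
        if score > theta:
            heapq.heappush(heap, (score, pivot))
            if len(heap) > k:
                heapq.heappop(heap)
            if len(heap) == k:
                theta = heap[0][0]         # new heap-entry threshold
        # Move every list sitting on the pivot past it so traversal advances.
        for lst in lists:
            if lst.current() == pivot:
                lst.seek(pivot + 1)
    return sorted(heap, reverse=True)


# Tiny example: two term lists, top-2 documents.
lists = [
    PostingList([(3, 1.5), (7, 2.0), (10, 3.0)], max_score=3.0),
    PostingList([(7, 1.0), (10, 6.0)], max_score=6.0),
]
print(wand_top_k(lists, k=2))   # [(9.0, 10), (3.0, 7)]
```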
“…In the third pane, document 10 is being scored for the non-essential lists. Since the partial score summed with the cumulative upper-bound of the non-essential lists (6 + 3 = 9) is greater than θ, document 10 must be scored in the non-essential lists (lines 22-30). The fourth pane shows document 10 being found and scored in the "best" list, resulting in a total score of 9 (lines 27-29).…”
Section: Document-at-a-Time (citation type: mentioning, confidence: 99%)
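This second walkthrough describes the MaxScore-style step in which a document's partial score from the essential lists, plus the cumulative upper bound of the non-essential lists, is compared against the threshold θ before the non-essential lists are probed. Below is a small self-contained sketch of that check; the Cursor stub, function name, and worked numbers are assumptions for illustration, not the paper's code.

```python
class Cursor:
    """Minimal posting-list cursor for one term (illustrative stub)."""
    def __init__(self, postings, max_score):
        self.postings = dict(postings)   # doc_id -> term score
        self.max_score = max_score       # upper bound on any score in the list

    def contains(self, doc_id):
        return doc_id in self.postings

    def score(self, doc_id):
        return self.postings[doc_id]


def score_non_essential(doc_id, partial_score, non_essential, theta):
    """Complete a document's score over the non-essential lists, MaxScore-style.

    partial_score: score accumulated from the essential lists.
    theta: current entry threshold of the top-k heap.
    Returns the final score, or None if the document is pruned early.
    """
    remaining_ub = sum(c.max_score for c in non_essential)
    score = partial_score
    for cursor in non_essential:
        # If even the most optimistic completion cannot beat theta, give up early.
        if score + remaining_ub <= theta:
            return None
        if cursor.contains(doc_id):
            score += cursor.score(doc_id)
        remaining_ub -= cursor.max_score
    return score


# Worked example mirroring the quoted walkthrough: partial score 6, a single
# non-essential ("best") list with upper bound 3, and theta = 8. Since
# 6 + 3 = 9 > theta, document 10 is scored there, giving a total of 9.
best = Cursor([(10, 3.0)], max_score=3.0)
print(score_non_essential(10, 6.0, [best], theta=8.0))   # -> 9.0
```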