2015
DOI: 10.1002/asi.23416
The twist measure for IR evaluation: Taking user's effort into account

Abstract: We present a novel measure for ranking evaluation, called Twist (τ). It is a measure for informational intents, which handles both binary and graded relevance. τ stems from the observation that searching is currently taken for granted and it is natural for users to assume that search engines are available and work well. As a consequence, users may take for granted the utility they gain from finding relevant documents, which is the focus of traditional measures. On the contrary, th…


Cited by 9 publications (6 citation statements)
References 52 publications
“…Twist (Ferro, Silvello, Keskustalo, Pirkola, & Järvelin, ) is a measure for informational intents, which handles both binary and graded relevance. Twist adopts a user model where the user scans the ranked list from top to bottom until s/he stops, and returns an estimate of the effort required by the user to traverse the ranked list.…”
Section: Grid of Points Measures and Setup
confidence: 99%
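The user model quoted above (a top-to-bottom scan of the ranked list until the user stops, with the measure estimating the effort spent in the traversal) can be illustrated with a toy sketch. This is a deliberately simplified illustration, not the actual τ formula from the paper; the stopping rank and the one-unit-per-document effort model are assumptions:

```python
def traversal_effort(relevance, stop_rank):
    """Toy effort estimate for a user scanning ranks 1..stop_rank.

    relevance: 0/1 judgments in ranked order (assumption: binary
    relevance, one unit of effort per document examined).
    Returns the fraction of examined documents that were non-relevant,
    i.e. the effort wasted before the user stops scanning.
    """
    examined = relevance[:stop_rank]
    wasted = sum(1 for r in examined if r == 0)
    return wasted / len(examined)

# A run with relevant documents at ranks 1 and 3; the user stops at rank 4:
# two of the four examined documents were non-relevant.
print(traversal_effort([1, 0, 1, 0, 1], 4))  # 0.5
```

Under this sketch a run that places all relevant documents above the stopping rank wastes no effort, which mirrors the intuition that effort-based measures reward front-loaded rankings.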
“…Finally, it would be interesting to experiment with what happens in the case of graded-relevance judgments. Not only is this a natural setting for nDCG and ERR, it also opens up other evaluation measures such as Graded Average Precision (GAP) and its extensions [Ferrante et al, 2014b; Robertson et al, 2010] or effort-based measures such as Twist [Ferro et al, 2016b].…”
Section: Discussion
confidence: 99%
“…We stem from [Angelini et al, 2014;Ferro et al, 2016b] for defining the basic concepts of topics, documents, ground-truth, run, and judged run. To the best of our knowledge, these basic concepts have not been explicitly defined in previous works [Amigó et al, 2013;Busin and Mizzaro, 2013;Maddalena and Mizzaro, 2014;Moffat, 2013].…”
Section: Preliminary Definitions
confidence: 99%
“…For T09, T10, T13, T14, and T15, we perform a lenient mapping of the relevance judgments by considering as relevant both highly relevant and relevant documents. • Graded: normalized Discounted Cumulated Gain (nDCG) [30], Expected Reciprocal Rank (ERR) [13], and Twist [23]. For T07, we calculate nDCG using binary relevance by setting gain to 0 for non-relevant documents and to 5 for relevant.…”
Section: Methods
confidence: 99%
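The nDCG computation described for T07 above (binary judgments mapped to a gain of 0 for non-relevant and 5 for relevant documents) can be sketched as follows. The log2 discount used here is the common formulation; the cited experiments do not state their exact discount base, so treat this as an illustrative implementation rather than the one used in that work:

```python
import math

def dcg(gains):
    # Discounted cumulated gain: the document at 1-based rank i
    # contributes gain / log2(i + 1), so rank 1 is undiscounted.
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

def ndcg(gains):
    # Normalize by the DCG of the ideal (descending-gain) ordering.
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0

# Binary judgments mapped as described: non-relevant -> 0, relevant -> 5.
judgments = [1, 0, 1, 1, 0]
gains = [5 * r for r in judgments]
print(round(ndcg(gains), 3))
```

Note that because nDCG is a ratio, scaling all gains by the same constant (here, 5) leaves the score unchanged; the mapping mainly matters when mixing graded and binary collections under one gain scale.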