As in most information retrieval (IR) studies, evaluation plays an essential part in Web search research. Both offline and online evaluation metrics are adopted to measure the performance of search engines. Offline metrics are usually based on relevance judgments of query-document pairs from assessors, while online metrics exploit user behavior data, such as clicks, collected by search engines to compare search algorithms. Although both types of IR evaluation metrics have achieved success, to what extent they can predict user satisfaction remains under-investigated. To shed light on this research question, we meta-evaluate a series of existing online and offline metrics to study how well they infer actual search user satisfaction in different search scenarios. We find that both types of evaluation metrics significantly correlate with user satisfaction, and that they reflect satisfaction from different perspectives for different search tasks. Offline metrics better align with user satisfaction in homogeneous search (i.e., ten blue links), whereas online metrics outperform them when vertical results are federated. Finally, we also propose to incorporate mouse hover information into existing online evaluation metrics, and empirically show that the resulting metrics align better with search user satisfaction than purely click-based online metrics.
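To make the meta-evaluation setup concrete, the following is a minimal sketch of how one might correlate metric scores with user-reported satisfaction, including a hover-augmented variant of a click-based metric. The session format, the reciprocal-rank-style metric, and the hover-weighting scheme are all illustrative assumptions, not the paper's actual metrics or data.

```python
# Minimal sketch: meta-evaluating online metrics against satisfaction labels.
# All data and metric definitions below are hypothetical, for illustration only.
from scipy.stats import spearmanr

# Hypothetical session logs: clicked ranks, hovered-but-unclicked ranks,
# and a 1-5 satisfaction rating reported by the user.
sessions = [
    {"clicks": [1, 3], "hovers": [2],    "satisfaction": 4},
    {"clicks": [1],    "hovers": [],     "satisfaction": 5},
    {"clicks": [5, 7], "hovers": [2, 4], "satisfaction": 2},
]

def click_metric(session):
    """Click-based online metric: reciprocal rank of the earliest click."""
    return 1.0 / min(session["clicks"]) if session["clicks"] else 0.0

def hover_metric(session, hover_weight=0.5):
    """Hover-augmented variant: treat a hover as a down-weighted click.

    hover_weight is an assumed parameter, not a value from the paper.
    """
    events = ([(rank, 1.0) for rank in session["clicks"]]
              + [(rank, hover_weight) for rank in session["hovers"]])
    if not events:
        return 0.0
    rank, weight = min(events)  # earliest interaction on the page dominates
    return weight / rank

# Meta-evaluation: rank-correlate each metric's scores with satisfaction.
satisfaction = [s["satisfaction"] for s in sessions]
for metric in (click_metric, hover_metric):
    scores = [metric(s) for s in sessions]
    rho, _ = spearmanr(scores, satisfaction)
    print(f"{metric.__name__}: Spearman rho = {rho:.3f}")
```

In practice one would compute such correlations over many sessions, grouped by search scenario (homogeneous versus federated), which is the comparison the abstract summarizes.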