Lecture Notes in Computer Science
DOI: 10.1007/978-3-540-78646-7_5
Here or There

Cited by 72 publications (53 citation statements)
References 11 publications
“…Note that with TABS, for a set of k search results, we collect one preference judgment out of a total of k(k−1)/2 possible judgments. We do so to make the human annotation task very light; pairwise preference judgments are easier to obtain from human assessors than requiring them to consistently rank a larger set of documents [3], [5]. Below we explain how we can use this highly incomplete set of preference judgments to evaluate ranking strategies.…”
Section: Evaluation of Ranking Measures (A. Preference Judgments), mentioning
confidence: 99%
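The k(k−1)/2 figure above is simply the number of unordered pairs among k ranked documents, so a single judgment per result list covers only a small fraction of the possible comparisons. A minimal sketch making that count concrete (the function name is ours, not from the cited paper):

```python
from itertools import combinations

def possible_preference_pairs(k: int) -> int:
    """Number of distinct document pairs among k search results: k*(k-1)/2."""
    return k * (k - 1) // 2

# Worked example: a result list of k = 10 documents admits 45 possible
# pairwise judgments; the scheme quoted above records only one of them.
k = 10
print(possible_preference_pairs(k))           # 45
print(len(list(combinations(range(k), 2))))   # 45, same count by enumeration
```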
“…Though the definition above is similar in spirit to the metrics proposed in [1], [2], [3], it is different in its computation of the accuracy of a ranking strategy. In all these previously proposed evaluation measures, more than one preference judgment per query is considered.…”
Section: Definition 8 (Ranking Accuracy), mentioning
confidence: 99%
“…Now, our key idea is to design a preference-based measure to score each ranked list by treating these inferred incomplete preference relations between documents as our gold standard. In this study we use precision of preference (ppref) [4].…”
Section: Methods, mentioning
confidence: 99%
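ppref, as described in [4], rewards a ranked list for ordering documents in agreement with the available preference judgments. The sketch below is a simplified reading of that idea in the incomplete-judgment setting quoted above: a ranking is scored by the fraction of stated preferences it places in the right order. Function and variable names are illustrative, not taken from [4].

```python
def preference_precision(ranking, preferences):
    """
    Fraction of gold preference pairs (preferred, other) that the ranking
    satisfies, i.e. places `preferred` above `other`.

    `ranking` is a list of document ids, best first; `preferences` is an
    iterable of (preferred_doc, other_doc) pairs. Pairs whose documents do
    not appear in the ranking are skipped.
    """
    pos = {doc: i for i, doc in enumerate(ranking)}
    scored = [(a, b) for a, b in preferences if a in pos and b in pos]
    if not scored:
        return 0.0
    satisfied = sum(1 for a, b in scored if pos[a] < pos[b])
    return satisfied / len(scored)

# Usage: a ranking that honours 2 of the 3 stated preferences scores 2/3.
ranking = ["d1", "d2", "d3", "d4"]
prefs = [("d1", "d3"), ("d2", "d4"), ("d4", "d2")]
print(preference_precision(ranking, prefs))  # 0.666...
```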
“…It can be a challenging task for humans to produce graded coherence assessments of topics. Therefore, we apply a pairwise preference user study [22] to gather human judgments. A similar method has been previously used to compare summarisation algorithms [23].…”
Section: Comparison of Coherence Metrics, mentioning
confidence: 99%
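The excerpt does not reproduce the protocol of [22] or [23], so the following is only an illustrative sketch of what a pairwise preference user study typically yields: per-pair votes from several annotators, collapsed here by simple majority. All names are hypothetical.

```python
from collections import Counter

def majority_preferences(judgments):
    """
    Collapse raw annotator votes into one preference per item pair.

    `judgments` is an iterable of (item_a, item_b, winner) tuples, where
    `winner` is either item_a or item_b. Returns a dict keyed by the sorted
    pair, mapping to the majority winner; tied pairs are dropped.
    """
    votes = {}
    for a, b, winner in judgments:
        key = tuple(sorted((a, b)))
        votes.setdefault(key, Counter())[winner] += 1
    result = {}
    for pair, counter in votes.items():
        (top, top_n), *rest = counter.most_common(2)
        if not rest or top_n > rest[0][1]:
            result[pair] = top
    return result

# Three annotators compare the word lists of two topics:
raw = [("topicA", "topicB", "topicA"),
       ("topicA", "topicB", "topicA"),
       ("topicA", "topicB", "topicB")]
print(majority_preferences(raw))  # {('topicA', 'topicB'): 'topicA'}
```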