Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval 2011
DOI: 10.1145/2009916.2010039
Repeatable and reliable search system evaluation using crowdsourcing

Cited by 82 publications (76 citation statements: 5 supporting, 67 mentioning, 0 contrasting). References 15 publications.
“…While LTR offers high performance, it critically depends on the availability of relevance judgments for training. We observed from our experiments based on real users (via a crowdsourcing-based evaluation recently proposed in [3]) that the final results strongly correlate with the number of visits (#visits) captured in the access logs. We provide a detailed analysis of this correlation, and for cases where training data and ground truth are not easy to obtain, we propose the use of #visits as an alternative.…”
Section: Introduction (mentioning)
confidence: 80%
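
The correlation this citing paper reports can be checked with a few lines of SciPy. A minimal sketch with made-up per-document values, not the authors' pipeline; relevance_scores and visit_counts are hypothetical:

```python
# Minimal sketch (hypothetical data): checking whether per-document
# relevance scores correlate with visit counts from an access log.
# scipy.stats.spearmanr returns the rank correlation and a p-value.
from scipy.stats import spearmanr

# Hypothetical per-document values: crowdsourced relevance scores
# and #visits extracted from access logs.
relevance_scores = [3.0, 1.5, 2.0, 0.5, 2.5]
visit_counts     = [120,  30,  45,  10,  80]

rho, p_value = spearmanr(relevance_scores, visit_counts)
print(f"Spearman's rho = {rho:.3f} (p = {p_value:.3f})")
```

A rho close to 1 would support using #visits as a training-free proxy for relevance, which is the substitution the statement proposes.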
“…As performance measures, we use the standard NDCG and Spearman's correlation coefficient. We build upon the data, queries, and methodology proposed by the recent SemSearch Challenge evaluation initiative [3]…”
Section: Methods (mentioning)
confidence: 99%
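
NDCG, the graded-relevance measure named here, follows a standard formula: DCG discounts each judgment by the log of its rank, and NDCG normalizes by the ideal (descending-sorted) DCG. A short sketch; the judgment values are made up for illustration:

```python
# Standard NDCG@k over graded relevance judgments.
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k graded judgments."""
    return sum(rel / math.log2(rank + 2)          # rank is 0-based
               for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    """DCG normalized by the ideal (descending-sorted) DCG."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Hypothetical graded judgments for one ranked result list.
print(ndcg_at_k([3, 2, 3, 0, 1, 2], k=6))   # ~0.96
```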
“…Notably, one source (MS) is dominant in the publication scenario and is part of 83% of the co-references. We followed the methodology of [15] to obtain relevance judgments for the ranking evaluation. We rated the top-10 results for each query.…”
Section: Methods (mentioning)
confidence: 99%
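
Rating the top-10 results per query corresponds to depth-10 pooling, the usual way a judgment set is assembled in IR evaluation. A sketch under that assumption; the system names and runs below are hypothetical, not from [15]:

```python
# Illustrative depth-10 pooling: union the top-k results of every
# system per query to form the set of documents sent to judges.
from collections import defaultdict

def build_pool(runs, depth=10):
    """runs: {system: {query: [doc ids ranked best-first]}}.
    Returns {query: set of doc ids} to be judged."""
    pool = defaultdict(set)
    for ranking_by_query in runs.values():
        for query, docs in ranking_by_query.items():
            pool[query].update(docs[:depth])
    return pool

runs = {
    "system_a": {"q1": ["d3", "d7", "d1"]},
    "system_b": {"q1": ["d7", "d2", "d9"]},
}
# Pool for q1 is {'d1','d2','d3','d7','d9'} (set order may vary).
print(dict(build_pool(runs, depth=10)))
```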
“…The fact that the pool will have versions assessed by different judges over time is not a problem. The ranking of the judged systems will be the same as if the judges had assessed all documents on the same day [29].…”
Section: Reusability (mentioning)
confidence: 99%
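
The stability claim can be quantified: if re-judging leaves the system ordering unchanged, the rank correlation between the two orderings is 1. A minimal sketch with hypothetical evaluation scores; Kendall's tau is a common statistic for comparing system rankings in IR:

```python
# Comparing system rankings under two judgment rounds.
# scipy.stats.kendalltau returns the rank correlation and a p-value.
from scipy.stats import kendalltau

# Hypothetical mean scores for five systems under the original
# judgments and under a later re-judged pool.
scores_original = [0.42, 0.38, 0.51, 0.29, 0.45]
scores_rejudged = [0.40, 0.37, 0.49, 0.31, 0.44]

tau, p_value = kendalltau(scores_original, scores_rejudged)
print(f"Kendall's tau = {tau:.3f}")   # 1.000: identical system ordering
```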