Kevin Martin Jose scite author profile

While a number of recent open-source toolkits for training and using neural information retrieval models have greatly simplified experiments with neural reranking methods, they essentially hard-code a "search-then-rerank" experimental pipeline. These pipelines consist of an efficient first-stage ranking method, like BM25, followed by a neural reranking method. Deviations from this setup often require hacks; some improvements, like adding a second reranking step that uses a more expensive neural method, are infeasible without major code changes. To improve the flexibility of such toolkits, we propose implementing experimental pipelines as dependency graphs of functional "IR primitives," which we call modules, that can be used and combined as needed. For example, a neural IR pipeline may rerank results from a Searcher module that efficiently retrieves results from an Index module that it depends on. In turn, the Index depends on a Collection to index, which is provided by the pipeline. This Searcher module is self-contained: the pipeline does not need to know about or interact with the Index of the Searcher, which is transparently shared among Searcher modules when possible (e.g., a BM25 and a QL Searcher might share the same Index). Similarly, a Reranker module might depend on a Trainer (e.g., TensorFlow), feature Extractor, Tokenizer, etc. In both cases, the pipeline needs to interact only with the Reranker or Searcher directly; the complexity of their dependencies is hidden and intelligently managed. We rewrite the Capreolus toolkit to take this approach and demonstrate its use.
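The dependency-graph idea in this abstract can be illustrated with a small sketch. It is not the Capreolus API: the `Registry` class, the `dependencies` attribute, and the two toy searchers (`CountSearcher`, `NormalizedSearcher`) are hypothetical names invented here, and the scoring functions are simple stand-ins rather than BM25 or QL. The sketch only shows how a pipeline might request a Searcher while the Index and Collection behind it are resolved and shared automatically.

```python
import json


class Registry:
    """Creates modules on demand and reuses instances with identical configs."""

    def __init__(self):
        self._cache = {}

    def get(self, cls, **config):
        # Same class + same config -> same instance, so dependencies are shared.
        key = (cls, json.dumps(config, sort_keys=True))
        if key not in self._cache:
            self._cache[key] = cls(self, **config)
        return self._cache[key]


class Module:
    """A module names its dependencies; the registry resolves them."""

    dependencies = {}  # attribute name -> module class (hypothetical convention)

    def __init__(self, registry, **config):
        self.config = config
        for name, cls in self.dependencies.items():
            # A dependency's config lives under the dependency's name.
            setattr(self, name, registry.get(cls, **config.get(name, {})))


class Collection(Module):
    def documents(self):
        # Toy in-memory collection used only for illustration.
        return {
            "d1": "neural ranking models",
            "d2": "ranking baselines and ranking metrics for many systems",
            "d3": "cooking recipes",
        }


class Index(Module):
    dependencies = {"collection": Collection}

    def __init__(self, registry, **config):
        super().__init__(registry, **config)
        docs = self.collection.documents()
        self.tokens = {doc_id: text.split() for doc_id, text in docs.items()}


class Searcher(Module):
    dependencies = {"index": Index}

    def score(self, terms, doc_tokens):
        raise NotImplementedError

    def search(self, query):
        terms = query.split()
        scored = [(self.score(terms, toks), doc_id)
                  for doc_id, toks in self.index.tokens.items()]
        # Highest score first; ties broken by document id.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [doc_id for score, doc_id in scored if score > 0]


class CountSearcher(Searcher):
    """Toy scorer: raw count of query-term occurrences."""

    def score(self, terms, doc_tokens):
        return sum(doc_tokens.count(t) for t in terms)


class NormalizedSearcher(Searcher):
    """Toy scorer: term count divided by document length."""

    def score(self, terms, doc_tokens):
        return sum(doc_tokens.count(t) for t in terms) / len(doc_tokens)
```

In this sketch the pipeline would call only `registry.get(CountSearcher)` or `registry.get(NormalizedSearcher)`; both end up holding the same `Index` object because their Index configurations are identical, which mirrors the sharing behavior the abstract describes.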


DiffIR: Exploring Differences in Ranking Models' Behavior

Jose, Nguyen, MacAvaney, et al. 2021


Understanding and comparing the behavior of retrieval models is a fundamental challenge that requires going beyond examining average effectiveness and per-query metrics, because these do not reveal key differences in how ranking models' behavior impacts individual results. DiffIR is a new open-source web tool to assist with qualitative ranking analysis by visually 'diffing' system rankings at the individual result level for queries where behavior significantly diverges. Using one of several configurable similarity measures, it identifies queries for which the compared models' rankings differ substantially at the level of individual results, and it provides a visual web interface to compare the rankings side-by-side. DiffIR additionally supports a model-specific visualization approach based on custom term importance weight files. These support studying the behavior of interpretable models, such as neural retrieval methods that produce document scores based on a similarity matrix or based on a single document passage. Observations from this tool can complement neural probing approaches like ABNIRML to generate quantitative tests. We provide an illustrative use case of DiffIR by studying the qualitative differences between recently developed neural ranking models on a standard TREC benchmark dataset.

CCS Concepts: Information systems → Retrieval effectiveness.
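The query-selection step described in this abstract can be sketched roughly as follows. This is not DiffIR's code: the abstract does not name its similarity measures, so top-k Jaccard overlap is used here purely as an assumed example, and the function names `topk_jaccard` and `most_divergent_queries` are hypothetical. Each run is assumed to be a dict mapping a query id to a ranked list of document ids.

```python
def topk_jaccard(ranking_a, ranking_b, k=10):
    """Jaccard overlap between the top-k document ids of two rankings."""
    top_a, top_b = set(ranking_a[:k]), set(ranking_b[:k])
    union = top_a | top_b
    # Two empty rankings are treated as identical.
    return len(top_a & top_b) / len(union) if union else 1.0


def most_divergent_queries(run_a, run_b, k=10, n=5):
    """Return up to n query ids whose top-k results differ the most.

    Only queries present in both runs are compared.
    """
    shared = run_a.keys() & run_b.keys()
    scored = [(topk_jaccard(run_a[q], run_b[q], k), q) for q in shared]
    # Lowest similarity first; ties broken by query id.
    scored.sort()
    return [q for _, q in scored[:n]]
```

The queries returned by such a function would be the ones worth inspecting side-by-side; any other rank-similarity measure could be substituted for the overlap function without changing the selection logic.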



Kevin Martin Jose

Capreolus: A Toolkit for End-to-End Neural Ad Hoc Retrieval

Flexible IR Pipelines with Capreolus

DiffIR: Exploring Differences in Ranking Models' Behavior
