2023
DOI: 10.1007/978-3-031-28241-6_7
A Unified Framework for Learned Sparse Retrieval

Abstract: Learned sparse retrieval (LSR) is a family of first-stage retrieval methods that are trained to generate sparse lexical representations of queries and documents for use with an inverted index. Many LSR methods have been introduced recently, with Splade models achieving state-of-the-art performance on MSMarco. Despite similarities in their model architectures, many LSR methods show substantial differences in effectiveness and efficiency. Differences in the experimental setups and configurations used make it dif…
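The core idea the abstract describes, a model that scores queries against documents through sparse lexical term weights stored in an inverted index, can be illustrated with a small, self-contained sketch. This is a toy illustration with made-up term weights and document IDs, not the paper's actual models; real LSR methods such as Splade learn these weights with a transformer, but the index-and-score step looks roughly like this:

```python
from collections import defaultdict

def build_inverted_index(doc_vecs):
    """Map each term to a postings list of (doc_id, weight), keeping nonzero weights only."""
    index = defaultdict(list)
    for doc_id, vec in doc_vecs.items():
        for term, weight in vec.items():
            if weight > 0:
                index[term].append((doc_id, weight))
    return index

def score(query_vec, index):
    """Rank documents by the dot product of sparse query and document vectors,
    touching only the postings lists for terms present in the query."""
    scores = defaultdict(float)
    for term, q_weight in query_vec.items():
        for doc_id, d_weight in index.get(term, []):
            scores[doc_id] += q_weight * d_weight
    return sorted(scores.items(), key=lambda kv: -kv[1])

# Hypothetical learned term weights for two documents.
docs = {
    "d1": {"sparse": 1.2, "retrieval": 0.8},
    "d2": {"dense": 1.0, "retrieval": 0.5},
}
index = build_inverted_index(docs)
print(score({"sparse": 1.0, "retrieval": 1.0}, index))
# → [('d1', 2.0), ('d2', 0.5)]
```

Because both query and document vectors are sparse, scoring only visits postings for the query's nonzero terms, which is what makes inverted-index retrieval efficient at scale.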

Cited by 14 publications (3 citation statements)
References 45 publications
“…We conduct a similar Pareto analysis for zero-shot retrieval for our work. LSR [31] is a recent concurrent work with ours. The authors provide a toolkit focused on sparse-retrieval model training and evaluate different training settings with in-domain datasets such as MS MARCO [32].…”
Section: Related Work
confidence: 90%
“…PROP proposed a novel representative words prediction training task [63], while B-PROP further improves upon PROP by replacing PROP's classical unigram language model with a more powerful BERT-based contextual language model [64]. Other researchers trade off PLM effectiveness for efficiency by utilizing the PLM to improve document indexing [19,77], pre-computing intermediate Transformer representations [27,42,47,65], selecting query-aware key blocks within a document for input squeezing [48,55], using the PLM to build sparse representations [25,56,66,68,73,112,114], weighting offline pseudo-query and document relevance [11], or reducing the number of Transformer layers [34,36,72].…”
Section: Related Work
confidence: 99%
“…However, the evolution of information retrieval has integrated machine learning algorithms to generate document vectors containing term scores learned from the documents, akin to traditional term frequency. This integration of machine learning, primarily based on neural networks, has led to the emergence of Neural Information Retrieval [6].…”
Section: Introduction
confidence: 99%