Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021
DOI: 10.18653/v1/2021.naacl-main.47

SPARTA: Efficient Open-Domain Question Answering via Sparse Transformer Matching Retrieval

Abstract: We introduce SPARTA, a novel neural retrieval method that shows great promise in performance, generalization, and interpretability for open-domain question answering. Unlike many neural ranking methods that use dense vector nearest neighbor search, SPARTA learns a sparse representation that can be efficiently implemented as an Inverted Index. The resulting representation enables scalable neural retrieval that does not require expensive approximate vector search and leads to better performance than its dense co…
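
The following is a minimal sketch (not the authors' implementation) of how sparse term weights can be served from an inverted index as the abstract describes; the `doc_term_weights` dictionaries are hypothetical stand-ins for the learned SPARTA-style weights, assumed to be precomputed offline.

```python
from collections import defaultdict

# Hypothetical precomputed sparse term weights per document
# (in SPARTA these would come from the learned sparse encoder).
doc_term_weights = {
    "d1": {"capital": 1.7, "france": 2.3, "paris": 2.9},
    "d2": {"capital": 0.9, "germany": 2.4, "berlin": 2.8},
}

# Build the inverted index: term -> list of (doc_id, weight).
inverted_index = defaultdict(list)
for doc_id, weights in doc_term_weights.items():
    for term, weight in weights.items():
        inverted_index[term].append((doc_id, weight))

def score(query_terms):
    """Score documents by summing their stored weights for each query term."""
    scores = defaultdict(float)
    for term in query_terms:
        for doc_id, weight in inverted_index.get(term, []):
            scores[doc_id] += weight
    return sorted(scores.items(), key=lambda x: x[1], reverse=True)

print(score(["capital", "france"]))  # "d1" ranks first
```

Because the query is only a bag of terms at retrieval time, scoring reduces to standard inverted-index lookups, which is what avoids approximate nearest-neighbor search over dense vectors.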

Cited by 27 publications (16 citation statements)
References 45 publications
“…Moreover, recent works have addressed ranking with sequence-to-sequence transformer-based approaches such as the MonoT5 model [31] for re-ranking documents returned by a BM25 ranker. Because a weak initial ranker such as BM25 may be the bottleneck to reaching higher performance, some approaches are reconsidering dense retrieval [14,15,7,44]. All these models are data-dependent, relying on the word/topic/query distribution of the training dataset, and their application to new domains is not always straightforward [28,32].…”
Section: Related Work
confidence: 99%
“…Along with the success of deep learning, which offers remarkable semantic representations, various deep retrieval models have been developed in the past few years, greatly enhancing retrieval effectiveness and thus lifting final QA performance. According to how the question and document are encoded and how their similarity is scored, dense retrievers in existing OpenQA systems can be roughly divided into three types: Representation-based Retriever [16], [29], [36], [72], Interaction-based Retriever [15], [31], and Representation-interaction Retriever [17], [81], as illustrated in Fig. 5.…”
Section: Dense Retriever
confidence: 99%
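
As a rough illustration of the first two categories (not taken from any of the cited systems), the sketch below contrasts a representation-based retriever, which encodes question and passage independently and compares single vectors, with an interaction-based one, which scores the pair jointly over all token pairs; `encode_text` and the toy embedding table are hypothetical stand-ins for learned encoders.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = {"what": 0, "is": 1, "sparse": 2, "retrieval": 3, "dense": 4}
EMB = rng.normal(size=(len(VOCAB), 8))  # toy token embeddings

def encode_text(tokens):
    # Hypothetical single-vector encoder: mean-pool token embeddings.
    return EMB[[VOCAB[t] for t in tokens]].mean(axis=0)

def representation_score(question, passage):
    # Representation-based: independent encodings compared by dot product;
    # passage vectors can be precomputed and indexed offline.
    return float(encode_text(question) @ encode_text(passage))

def interaction_score(question, passage):
    # Interaction-based: the score depends on all token-pair interactions,
    # so it must be computed jointly at query time (more accurate, slower).
    q = EMB[[VOCAB[t] for t in question]]
    p = EMB[[VOCAB[t] for t in passage]]
    return float(np.tanh(q @ p.T).sum())

q = ["what", "is", "sparse", "retrieval"]
d = ["sparse", "retrieval", "is", "dense"]
print(representation_score(q, d), interaction_score(q, d))
```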
“…Representation-interaction Retriever: In order to achieve both high accuracy and efficiency, some recent systems [17], [81] combine representation-based and interaction-based methods. For instance, ColBERT-QA [17] develops its retriever based on ColBERT [82], which extends the dual-encoder architecture by performing a simple token-level interaction step over the question and document representations to calculate the similarity score.…”
Section: Retriever-only DenSPI
confidence: 99%
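
A minimal numerical sketch of the ColBERT-style late interaction mentioned above, assuming token embedding matrices are already available; the MaxSim rule below (each query token takes its best-matching document token, and the maxima are summed) illustrates the general idea rather than any particular implementation.

```python
import numpy as np

def maxsim_score(query_embs, doc_embs):
    """ColBERT-style late interaction (MaxSim).

    query_embs: (num_query_tokens, dim) query token embeddings
    doc_embs:   (num_doc_tokens, dim) document token embeddings
    """
    sim = query_embs @ doc_embs.T        # all token-pair similarities
    return float(sim.max(axis=1).sum())  # max over doc tokens, sum over query tokens

rng = np.random.default_rng(0)
q = rng.normal(size=(4, 16))    # toy embeddings for 4 query tokens
d1 = rng.normal(size=(30, 16))  # toy embeddings for two candidate documents
d2 = rng.normal(size=(30, 16))
print(maxsim_score(q, d1), maxsim_score(q, d2))
```

Because document token embeddings can be precomputed, only the cheap MaxSim step runs at query time, which is how this family keeps interaction-level accuracy at near dual-encoder efficiency.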
“…There are two popular approaches in conversational chatbot modeling, namely Transformer network-based models such as [13]-[15] and recurrent neural network (RNN)-based sequence-to-sequence (Seq2Seq) models such as [16]-[22]. The Transformer network is based on feed-forward networks [11], wherein sentences are processed as a whole rather than word by word through a self-attention mechanism, which can be highly parallelized.…”
Section: Introduction
confidence: 99%
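
As a minimal illustration of the self-attention mechanism this excerpt refers to (not the cited chatbot models themselves), the sketch below computes scaled dot-product self-attention over a whole sequence in a few matrix products, which is what makes the computation parallelizable across positions, unlike an RNN that steps through tokens one at a time.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a full sequence.

    X: (seq_len, d_model) input token representations.
    Every position attends to every other position in one pass.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                       # 5 tokens, d_model = 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)        # (5, 8)
```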