Proceedings of the 2nd Workshop on Machine Reading for Question Answering 2019
DOI: 10.18653/v1/d19-5819
ReQA: An Evaluation for End-to-End Answer Retrieval Models

Abstract: Popular QA benchmarks like SQuAD have driven progress on the task of identifying answer spans within a specific passage, with models now surpassing human performance. However, retrieving relevant answers from a huge corpus of documents is still a challenging problem, and places different requirements on the model architecture. There is growing interest in developing scalable answer retrieval models trained end-to-end, bypassing the typical document retrieval step. In this paper, we introduce Retrieval Question…

Cited by 45 publications (50 citation statements). References 31 publications.
“…Seo et al. (2019) introduced a phrase-level representation model that indexes every potential answer span as a vector representation and exploits approximate nearest neighbour (ANN) methods to retrieve the final answer span directly from a large vector index (Slaney and Casey, 2008). Ahmad et al. (2019) argued that phrase-level answers may not always be required or preferred. Instead, they proposed to find the right "sentence" as an answer from a large body of text, and used the universal sentence encoder (Cer et al., 2018) to retrieve the correct sentence given a question.…”
Section: Related Work
confidence: 99%
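The sentence-level retrieval described in this statement can be sketched as a nearest-neighbour lookup over sentence embeddings. The toy vectors and function names below are illustrative only; a real system would use a learned encoder (e.g. the universal sentence encoder) and an ANN library rather than the brute-force search shown here:

```python
import numpy as np

def build_index(sentence_embeddings):
    # Normalize rows so a dot product equals cosine similarity.
    norms = np.linalg.norm(sentence_embeddings, axis=1, keepdims=True)
    return sentence_embeddings / norms

def retrieve(query_embedding, index, k=3):
    # Score every candidate answer sentence against the query
    # and return the indices of the top-k matches, best first.
    q = query_embedding / np.linalg.norm(query_embedding)
    scores = index @ q
    top = np.argsort(-scores)[:k]
    return top, scores[top]

# Toy 4-dimensional "embeddings" for three candidate answer sentences.
sentences = np.array([
    [0.9, 0.1, 0.0, 0.0],
    [0.0, 1.0, 0.2, 0.0],
    [0.1, 0.0, 0.9, 0.3],
])
index = build_index(sentences)
ids, scores = retrieve(np.array([1.0, 0.0, 0.1, 0.0]), index, k=2)
print(ids)  # sentence 0 ranks first for this query
```

At corpus scale, the exhaustive `index @ q` scoring is replaced by approximate nearest-neighbour search, which is what makes end-to-end answer retrieval tractable over millions of sentences.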
“…Our approach follows the sentence-level QA system of Ahmad et al. (2019) for two reasons: (1) answers to many research questions cannot be covered by a short phrase-level span, and a sentence-level answer provides more context to deliver relevant solutions; (2) our preliminary study found that it is important to have a trainable retriever that goes beyond TF-IDF keyword matching to ensure enough recall in the paper domain.…”
Section: Related Work
confidence: 99%
“…For example, typical passage-level retrieval systems set a to be the passage and leave c empty (Chen et al., 2017; Yang et al., 2019a). The sentence-level retrieval task sets a to be each sentence in a text knowledge base and c to be the surrounding text (Ahmad et al., 2019). Lastly, phrase-level QA systems set a to be all valid phrases from a corpus and c to be the surrounding text.…”
Section: Problem Formulation
confidence: 99%
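The three retrieval granularities above differ only in how the answer candidate a and its context c are constructed. A minimal sketch of that candidate construction, with hypothetical helper names not taken from any of the cited systems:

```python
def passage_candidates(passages):
    # Passage-level: a is the whole passage, c is left empty.
    return [(p, "") for p in passages]

def sentence_candidates(sentences):
    # Sentence-level (ReQA): a is each sentence,
    # c is the surrounding text from the same document.
    candidates = []
    for i, s in enumerate(sentences):
        context = " ".join(sentences[:i] + sentences[i + 1:])
        candidates.append((s, context))
    return candidates

doc = ["Paris is the capital of France.", "It hosted the 1900 Olympics."]
print(passage_candidates([" ".join(doc)]))
print(sentence_candidates(doc))
```

A phrase-level system would enumerate every valid span inside each sentence as a, which is why it needs a much larger index than the sentence-level setting.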
“…Experiments are conducted in two settings: OpenQA (Chen et al., 2017), which requires phrase-level answers, and retrieval QA (ReQA), which requires sentence-level answers (Ahmad et al., 2019). Our proposed SpartaQA system achieves new state-of-the-art results across 15 different domains and 2 languages with significant performance gains, including on OpenSQuAD and OpenCMRC.…”
Section: Introduction
confidence: 99%
“…Recently, several researchers have proposed different deep neural models for text-based QA that compare two segments of text and produce a similarity score. Both document-level (Chen et al., 2017; Seo et al., 2018, 2019; Wu et al., 2018) and sentence-level (Ahmad et al., 2019) retrieval have been studied on many public datasets such as SQuAD (Rajpurkar et al., 2016) and NQ (Kwiatkowski et al., 2019).…”
Section: Related Work
confidence: 99%