2020
DOI: 10.1609/aaai.v34i05.6282

TANDA: Transfer and Adapt Pre-Trained Transformer Models for Answer Sentence Selection

Abstract: We propose TandA, an effective technique for fine-tuning pre-trained Transformer models for natural language tasks. Specifically, we first transfer a pre-trained model into a model for a general task by fine-tuning it with a large and high-quality dataset. We then perform a second fine-tuning step to adapt the transferred model to the target domain. We demonstrate the benefits of our approach for answer sentence selection, which is a well-known inference task in Question Answering. We built a large scale datas…
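
A minimal sketch of the "transfer then adapt" procedure described in the abstract, written against the Hugging Face transformers Trainer API. The library choice, model name, hyperparameters, output paths, and the (question, candidate, label) data format are assumptions for illustration, not the authors' exact setup.

import torch
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

class AS2Dataset(torch.utils.data.Dataset):
    # Question/candidate sentence pairs with binary labels (1 = candidate answers the question).
    def __init__(self, pairs, tokenizer, max_len=128):
        self.enc = tokenizer([q for q, s, _ in pairs], [s for q, s, _ in pairs],
                             truncation=True, padding="max_length", max_length=max_len)
        self.labels = [label for _, _, label in pairs]
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[i])
        return item

def finetune(model_name_or_path, train_pairs, out_dir, epochs=2):
    tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
    model = AutoModelForSequenceClassification.from_pretrained(model_name_or_path, num_labels=2)
    args = TrainingArguments(output_dir=out_dir, num_train_epochs=epochs,
                             per_device_train_batch_size=32, learning_rate=2e-5)
    Trainer(model=model, args=args, train_dataset=AS2Dataset(train_pairs, tokenizer)).train()
    model.save_pretrained(out_dir)
    tokenizer.save_pretrained(out_dir)
    return out_dir

# Step 1 ("transfer"): fine-tune the pre-trained Transformer on a large, general AS2 dataset.
# Step 2 ("adapt"): fine-tune the resulting checkpoint on the smaller target-domain dataset.
# transferred = finetune("bert-base-uncased", general_as2_pairs, "out/transfer")
# adapted = finetune(transferred, target_domain_pairs, "out/adapt")
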

Cited by 164 publications (251 citation statements)
References 16 publications

“…Yang et al [23] applied it to Ad Hoc Document Retrieval, obtaining significant improvement. Garg et al [8] fine-tuned BERT for AS2, achieving the state of the art. However, BERT's high computational cost prevents its use in most real-world applications.…”
Section: Related Work (mentioning)
confidence: 99%
“…AS2, given a question and a set of answer sentence candidates, consists in selecting sentences (e.g., retrieved by a search engine) that correctly answer the question. Neural models have contributed significant new techniques to AS2, e.g., [8,11]. More recently, neural language models, e.g., ELMO [13], GPT [14], BERT [5], RoBERTa [10], XLNet [3], have led to major advancements in NLP.…”
Section: Introduction (mentioning)
confidence: 99%
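
To illustrate the AS2 task as defined in the statement above, the following is a minimal inference sketch: a fine-tuned sequence-classification model scores each candidate sentence against the question, and candidates are ranked by the probability of the positive class. The model directory, function name, and the binary cross-encoder setup are illustrative assumptions, not details taken from the paper.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

def rank_candidates(model_dir, question, candidates):
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForSequenceClassification.from_pretrained(model_dir).eval()
    enc = tokenizer([question] * len(candidates), candidates,
                    truncation=True, padding=True, return_tensors="pt")
    with torch.no_grad():
        # Probability of the positive ("correct answer") class for each candidate.
        scores = torch.softmax(model(**enc).logits, dim=-1)[:, 1]
    order = scores.argsort(descending=True)
    return [(candidates[i], scores[i].item()) for i in order]

# ranked = rank_candidates("out/adapt", "Who wrote Hamlet?", retrieved_sentences)
# best_sentence, best_score = ranked[0]
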
“…However, more recent neural models, such as BERT-based models [Garg et al 2019; Raffel et al 2019], have achieved largely improved performance on the same tasks. It is worth exploring the integration of quantum models with the new BERT architecture [Devlin et al 2019] in the future.…”
Section: Quantum-inspired Neural Representation Models (mentioning)
confidence: 99%
“…Note that the current state-of-the-art BERT-based neural model for TREC-QA has achieved MAP and MRR of 0.943 and 0.974, respectively [Garg et al 2019]. The TANDA model mentioned above currently gives the best performance on the WikiQA dataset (MAP and MRR of 0.92 and 0.933 respectively, as compared to 0.695 and 0.71 by [Zhang et al 2018d]).…”
(mentioning)
confidence: 95%
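
The MAP and MRR figures quoted in these statements are standard ranking metrics for answer sentence selection. Below is a minimal sketch of how they can be computed, assuming each question's candidates are already sorted by model score and carry binary relevance labels; the function names and the example data are illustrative.

def average_precision(labels_in_rank_order):
    # Precision averaged over the positions of the correct answers.
    hits, precisions = 0, []
    for rank, relevant in enumerate(labels_in_rank_order, start=1):
        if relevant:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / max(hits, 1)

def reciprocal_rank(labels_in_rank_order):
    # 1 / rank of the first correct answer; 0 if no candidate is correct.
    for rank, relevant in enumerate(labels_in_rank_order, start=1):
        if relevant:
            return 1.0 / rank
    return 0.0

def map_mrr(per_question_labels):
    # Mean Average Precision and Mean Reciprocal Rank over all questions.
    aps = [average_precision(labels) for labels in per_question_labels]
    rrs = [reciprocal_rank(labels) for labels in per_question_labels]
    return sum(aps) / len(aps), sum(rrs) / len(rrs)

# Example: two questions whose candidates are already sorted by model score.
# map_score, mrr_score = map_mrr([[0, 1, 0, 1], [1, 0, 0]])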