Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 2020
DOI: 10.18653/v1/2020.acl-main.504

The Cascade Transformer: an Application for Efficient Answer Sentence Selection

Abstract: Large transformer-based language models have been shown to be very effective in many classification tasks. However, their computational complexity prevents their use in applications requiring the classification of a large set of candidates. While previous works have investigated approaches to reduce model size, relatively little attention has been paid to techniques to improve batch throughput during inference. In this paper, we introduce the Cascade Transformer, a simple yet effective technique to adapt trans…
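
The abstract describes trimming the candidate set at intermediate transformer layers so that deeper, costlier layers process fewer inputs. The following is a minimal sketch of that cascading idea, not the authors' implementation: the names encoder_layers, rankers, and drop_ratio are assumptions, and the per-question batching used in the paper is simplified to a single flat batch of candidates.

# Minimal sketch of the cascading idea from the abstract (illustrative only).
# Assumptions: each element of `encoder_layers` maps [batch, seq, dim] -> [batch, seq, dim],
# and `rankers` is a dict {layer_index: small linear classifier} attached at selected depths.
import torch

@torch.no_grad()
def cascade_rerank(candidates, encoder_layers, rankers, drop_ratio=0.3):
    """Score a batch of candidate sentences, pruning the lowest-scoring
    fraction after each intermediate ranker so deeper layers see fewer inputs."""
    hidden = candidates                        # [num_candidates, seq_len, dim]
    keep = torch.arange(hidden.size(0))        # indices of surviving candidates
    scores = torch.zeros(candidates.size(0))   # last score computed for each candidate
    for i, layer in enumerate(encoder_layers):
        hidden = layer(hidden)
        if i in rankers:                       # partial classifier at this depth
            s = rankers[i](hidden[:, 0]).squeeze(-1)   # score from the first ([CLS]-like) token
            scores[keep] = s
            n_keep = max(1, int(s.size(0) * (1 - drop_ratio)))
            top = torch.topk(s, n_keep).indices        # keep the highest-scoring fraction
            hidden, keep = hidden[top], keep[top]
    return scores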


Cited by 40 publications (44 citation statements) · References 44 publications
“…One key limitation of BERT is its inability to handle long input sequences and hence difficulty in ranking texts beyond a certain length (e.g., "full-length" documents such as news articles). This limitation is addressed by a number of models (Nogueira and Cho, 2019; Akkalyoncu Yilmaz et al., 2019; Dai and Callan, 2019b; MacAvaney et al., 2019), and a simple retrieve-then-rerank approach can be elaborated into a multi-stage architecture with reranker pipelines (Nogueira et al., 2019a; Matsubara et al., 2020; Soldaini and Moschitti, 2020) that balance effectiveness and efficiency. On top of multi-stage ranking architectures, researchers have proposed additional innovations, including query expansion, document expansion (Nogueira et al., 2019b; Nogueira and Lin, 2019) and term importance prediction (Dai and Callan, 2019a, 2020).…”
Section: Multi-stage Ranking Architectures (mentioning, confidence: 99%)
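
The retrieve-then-rerank pipeline described in the excerpt above can be summarized in a few lines. This is a hedged illustration of the general pattern, not any cited system: retrieve, rerankers, and the stage depths are hypothetical placeholders supplied by the caller.

# Illustrative multi-stage ranking pipeline: a cheap first-stage retriever followed by
# progressively more expensive rerankers, each keeping only its top-k candidates.
from typing import Callable, List, Tuple

def multi_stage_rank(query: str,
                     retrieve: Callable[[str, int], List[str]],
                     rerankers: List[Tuple[Callable[[str, str], float], int]],
                     depth: int = 1000) -> List[str]:
    """Run first-stage retrieval, then successive rerank-and-truncate stages."""
    candidates = retrieve(query, depth)      # cheap first stage, e.g. a lexical searcher
    for score, keep_k in rerankers:          # later stages: costlier models, fewer documents
        candidates = sorted(candidates, key=lambda d: score(query, d), reverse=True)[:keep_k]
    return candidates

A caller might pass a lexical searcher as retrieve and a list such as [(cheap_scorer, 100), (transformer_scorer, 10)] as rerankers, so each stage trades added cost for a smaller candidate pool; this is the effectiveness/efficiency balance the excerpt refers to.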
“…Finally, our word-relatedness encoder can replace the standard attention to enhance the speed of the fast attention-based approaches, resulting in a fast and accurate network in the class of fast methods. Our research will gain more and more importance also in light of improving the efficiency of large architectures using our models for sequential re-ranking (Matsubara et al., 2020; Soldaini and Moschitti, 2020).…”
Section: Introduction (mentioning, confidence: 99%)
“…We can see that in both datasets, early exiting is able to accelerate inference by ∼2.5× while maintaining the original model effectiveness. It is worth noting that in Cascade Transformer (CT) (Soldaini and Moschitti, 2020), only a part of the development set is used for evaluation, and therefore the scores are not directly comparable. However, in terms of relative performance, our model appears to achieve a bit higher inference speedup with a comparable score degradation.…”
Section: Results (mentioning, confidence: 99%)
“…Our work differs from them by using an early exiting strategy that specializes for document ranking. Another related work that focuses on retrieval is Cascade Transformer (Soldaini and Moschitti, 2020), where a fixed proportion of samples are dropped after each layer. In contrast, our work drops samples based on their scores, and empirically we are able to achieve higher inference speedups.…”
Section: Related Work (mentioning, confidence: 99%)
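
The two excerpts above contrast dropping a fixed proportion of candidates after each layer (the Cascade Transformer rule) with dropping candidates whose scores fall below a threshold (the early-exit rule of the citing work). The snippet below only illustrates that difference on plain (candidate, score) pairs; drop_ratio and threshold are placeholder values.

def prune_fixed_proportion(scored, drop_ratio=0.3):
    """Cascade-Transformer-style rule: discard a fixed fraction of the
    lowest-scoring candidates at each stage, regardless of score values."""
    scored = sorted(scored, key=lambda x: x[1], reverse=True)
    keep = max(1, int(len(scored) * (1 - drop_ratio)))
    return scored[:keep]

def prune_by_score(scored, threshold=0.5):
    """Score-based rule (as in the citing work): drop candidates whose current
    score falls below a threshold, so the number pruned varies per batch."""
    return [(c, s) for c, s in scored if s >= threshold]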