Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), 2018
DOI: 10.18653/v1/n18-1115
The Context-Dependent Additive Recurrent Neural Net

Abstract: Contextual sequence mapping is one of the fundamental problems in Natural Language Processing. Instead of relying solely on the information presented in the text, the learning agent has access to a strong external signal that assists the learning process. In this paper, we propose a novel family of Recurrent Neural Network units: the Context-dependent Additive Recurrent Neural Network (CARNN), designed specifically to leverage this external signal. The experimental results on public datasets in the di…
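The abstract elides the unit's update equations, and they are not reproduced here. As a rough illustration of the idea only, below is a minimal sketch of one plausible context-dependent additive recurrent cell, assuming the gate is computed from the token input together with a fixed external context vector; the class name, gate parameterization, and exact update rule are assumptions, not the paper's formulation.

```python
# Hypothetical sketch of a context-dependent additive recurrent cell.
# Assumption: an external context vector z modulates a gate g_t, and the
# hidden state is an additive (convex) blend of a candidate state and the
# previous state -- no multiplicative recurrence on h.
import torch
import torch.nn as nn

class CARNNCellSketch(nn.Module):
    def __init__(self, input_size: int, context_size: int, hidden_size: int):
        super().__init__()
        self.candidate = nn.Linear(input_size, hidden_size)            # candidate from the token alone
        self.gate = nn.Linear(input_size + context_size, hidden_size)  # gate sees token + external context

    def forward(self, x_t: torch.Tensor, z: torch.Tensor,
                h_prev: torch.Tensor) -> torch.Tensor:
        g_t = torch.sigmoid(self.gate(torch.cat([x_t, z], dim=-1)))    # context-dependent gate
        h_tilde = torch.tanh(self.candidate(x_t))                      # candidate state
        return g_t * h_tilde + (1.0 - g_t) * h_prev                    # additive interpolation
```

Looping this cell over a token sequence with a fixed z yields a recurrence in which the external signal steers, at every step, how much of the new candidate replaces the previous hidden state.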

Cited by 19 publications (7 citation statements). References 16 publications.
“…with increasingly challenging types of questions. For the task of AS2, initial efforts embedded the question and candidates using CNNs (Severyn and Moschitti, 2015), weight-aligned networks (Shen et al., 2017; Tran et al., 2018; Tay et al., 2018) and compare-aggregate architectures (Wang and Jiang, 2016; Bian et al., 2017; Yoon et al., 2019). Recent progress has stemmed from the application of transformer models for performing AS2 (Garg et al., 2020; Han et al., 2021; Lauriola and Moschitti, 2021).…”
Section: Related Work
confidence: 99%
“…Answer Sentence Selection (AS2): In the last few years, several approaches have been proposed for AS2. For example, Severyn and Moschitti (2015) applied CNNs to create question and answer representations, while others proposed inter-weighted alignment networks (Shen et al., 2017; Tran et al., 2018; Tay et al., 2018). The use of compare-and-aggregate architectures has also been extensively evaluated (Wang and Jiang, 2016; Bian et al., 2017; Yoon et al., 2019).…”
Section: Related Work
confidence: 99%
“…The forward LSTM processes the question from left to right and outputs the sequence $\overrightarrow{h}_t$, while the backward LSTM processes the question in the reverse direction and outputs the sequence $\overleftarrow{h}_t$. Both the forward and backward layer outputs are computed using the standard LSTM updating equations, Equations (1)–(6). The output at each time step is the concatenation of the two output vectors from both directions, calculated using the following equation:…”
Section: Attention-based Bi-LSTM
confidence: 99%
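The equation itself is elided in the excerpt. For a standard bidirectional LSTM, the per-step output described above is presumably the concatenation of the two directional hidden states; a sketch of the assumed form:

```latex
% Assumed form of the elided equation: standard BiLSTM output at step t,
% concatenating the forward and backward hidden states.
h_t = \left[\, \overrightarrow{h}_t \,;\, \overleftarrow{h}_t \,\right]
```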
“…In factoid question answering, the word embeddings are trained with word2vec [45] on the Chinese Wikipedia dump. The dimension of the word vectors is set to 300.…”
Section: Evaluation Metrics and Experimental Settings
confidence: 99%
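As an illustration of that setting only, here is a minimal sketch of training 300-dimensional word2vec vectors with gensim; the corpus path, preprocessing, and hyperparameters other than the dimension are hypothetical, since the citing paper's exact pipeline is not shown in the excerpt.

```python
# Hypothetical sketch: train 300-dim word2vec vectors on a pre-tokenized
# Wikipedia corpus, mirroring the setting quoted above. The file paths,
# window size, and min_count are illustrative assumptions.
from gensim.models import Word2Vec
from gensim.models.word2vec import LineSentence

# One whitespace-tokenized sentence per line (Chinese text would need
# word segmentation first).
sentences = LineSentence("zhwiki_tokenized.txt")  # hypothetical path

model = Word2Vec(
    sentences,
    vector_size=300,  # matches the 300-dim setting in the excerpt
    window=5,         # assumption: context window size
    min_count=5,      # assumption: drop rare tokens
    workers=4,
)
model.save("zhwiki_w2v_300d.model")  # hypothetical output path
```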