2017
DOI: 10.48550/arxiv.1710.02772
Preprint

Smarnet: Teaching Machines to Read and Comprehend Like Human

Abstract: Machine Comprehension (MC) is a challenging task in the field of Natural Language Processing that aims to guide a machine to comprehend a passage and answer a given question. Many existing approaches to the MC task suffer from bottlenecks such as insufficient lexical understanding, complex question-passage interaction, and incorrect answer extraction. In this paper, we address these problems from the viewpoint of how humans deal with reading tests in a scientific way. Specifically, …

Cited by 5 publications (5 citation statements). References 16 publications (1 reference statement).
“…In the web domain, except for the verified F1 scores, we see a similar trend. Surprisingly, we outperform approaches which use multi-layer recurrent pointer networks with specialized memories (Chen et al., 2017b; Hu et al., 2017). […] is significantly better than our final model. Although we found it could obtain high results, it was less consistent across different runs and gave lower scores on average (49.30) compared to our approach averaged over 4 runs (51.03).…”
Section: Results
confidence: 77%
“…Pointer networks with multi-hop reasoning, and syntactic and NER features, have been used recently in three architectures: Smarnet (Chen et al., 2017b), Reinforced Mnemonic Reader (Hu et al., 2017), and MEMEN (Pan et al., 2017), for both SQuAD and TriviaQA. Most of the above also use document truncation.…”
Section: Related Approaches
confidence: 99%
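The pointer networks this statement refers to extract an answer by scoring every passage position as a potential span start or end. Below is a minimal PyTorch sketch of that span-pointer idea; the class name, tensor shapes, and variable names are assumptions for illustration, and none of the cited models use exactly this layer (they add multi-hop memories and syntactic/NER features on top):

```python
import torch
import torch.nn as nn

class SpanPointer(nn.Module):
    """Minimal span-pointer head: scores each passage position as a
    candidate answer start or end. Illustrative sketch only; the cited
    architectures add multi-hop reasoning and extra lexical features."""
    def __init__(self, hidden_dim):
        super().__init__()
        self.start_scorer = nn.Linear(hidden_dim, 1)
        self.end_scorer = nn.Linear(hidden_dim, 1)

    def forward(self, passage_repr):
        # passage_repr: (batch, passage_len, hidden_dim), a query-aware encoding
        start_logits = self.start_scorer(passage_repr).squeeze(-1)
        end_logits = self.end_scorer(passage_repr).squeeze(-1)
        return start_logits, end_logits  # each (batch, passage_len)

# At prediction time, choose the span (i, j) with i <= j that maximizes
# start_logits[i] + end_logits[j].
```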
“…Finally, the outputs of the match-LSTM in the two directions are concatenated together and later fed to the answer prediction module. In addition, R-Net [88], IA Reader [72] and Smarnet [8] also utilize RNNs to update the query-aware context representations to perform multi-hop interaction.…”
Section: (2b) Multi-hop Interaction
confidence: 99%
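A brief sketch of the bidirectional pattern this statement describes: run an RNN over the query-aware context in both directions and concatenate the two outputs before answer prediction. The shapes and variable names here are assumptions for illustration, not the implementation of any cited model:

```python
import torch
import torch.nn as nn

# Bidirectional LSTM over the query-aware context; PyTorch concatenates the
# forward and backward outputs along the feature dimension automatically.
match_lstm = nn.LSTM(input_size=256, hidden_size=128,
                     bidirectional=True, batch_first=True)

query_aware_context = torch.randn(2, 50, 256)  # (batch, passage_len, features)
outputs, _ = match_lstm(query_aware_context)   # (2, 50, 256) = fwd || bwd
# `outputs` is what would be fed to the answer prediction module.
```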
“…In the Smarnet model, Chen et al. [8] not only use a gate mechanism to control the question's influence on the context, but also introduce another gate mechanism to refine the query representation with knowledge of the context. The combination of these two gated-attention mechanisms implements alternant reading between the context and the question with mutual information.…”
Section: (2b) Multi-hop Interaction
confidence: 99%
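A hedged sketch of that two-gate scheme: one sigmoid gate scales how strongly a question summary influences each context word, and a second gate refines the question representation with a summary of the gated context. All names, shapes, and the mean-pooled summary are assumptions made for illustration; Smarnet's exact gating functions differ in detail:

```python
import torch
import torch.nn as nn

class TwoWayGate(nn.Module):
    """Illustrative alternant gated reading, not Smarnet's exact formulation:
    gate the context with the question, then refine the question with the
    gated context."""
    def __init__(self, dim):
        super().__init__()
        self.ctx_gate = nn.Linear(2 * dim, dim)  # question -> context gate
        self.qry_gate = nn.Linear(2 * dim, dim)  # context -> question gate

    def forward(self, context, question_summary):
        # context: (batch, len, dim); question_summary: (batch, dim)
        q = question_summary.unsqueeze(1).expand_as(context)
        g_c = torch.sigmoid(self.ctx_gate(torch.cat([context, q], dim=-1)))
        gated_context = g_c * context  # the question controls context reading

        ctx_summary = gated_context.mean(dim=1)  # crude summary (assumption)
        g_q = torch.sigmoid(self.qry_gate(
            torch.cat([question_summary, ctx_summary], dim=-1)))
        refined_question = g_q * question_summary  # context refines the question
        return gated_context, refined_question
```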
“…One possible explanation is that short passages offer fewer openings for simple questions, such as "when", "who", or "how many", so the annotators of the dataset had to resort to more elaborate alternatives. [42]

Model                               EM (%)    F1 (%)
Reinforced Mnemonic Reader [3]      79.545    86.654
MEMEN [4]                           78.234    85.344
FRC [32]                            76.240    84.599
RaSoR + TR + LM [5]                 77.583    84.163
Stochastic Answer Networks [6]      76.828    84.396
r-net [7]                           76.461    84.265
FusionNet [8]                       75.968    83.900
DCN+ [9]                            75.087    83.081
Conductor-net [10]                  74.405    82.742
BiDAF + Self Attention [11]         72.139    81.048
Smarnet [12]                        71.415    80.160
Ruminating Reader [13]              70.639    79.456
jNet [14]                           70.607    79.821
ReasoNet [15]                       70.555    79.364
Document Reader [16]                70.733    79.353
RaSoR [17]                          70.849    78.741
FastQAExt [18]                      70.849    78.857
Multi-Perspective Matching [19]     70.387    78.784
SEDT [20]                           68.163    77.527
FABIR (Ours)                        67.744    77.605
BiDAF [21]                          67.974    77.323
Dynamic Coattention Networks [22]   66.233    77.896
Match-LSTM with Bi-Ans-Ptr [23]     64.744    73.743
Fine-Grained Gating [24]            62.446    73.327
OTF dict+spelling [25]              64.083    73.056
Dynamic Chunk Reader [26]           62.499    70.956
…”
Section: FABIR and BiDAF Statistics
confidence: 99%