2018
DOI: 10.1162/tacl_a_00018

Questionable Answers in Question Answering Research: Reproducibility and Variability of Published Results

Abstract: “Based on theoretical reasoning it has been suggested that the reliability of findings published in the scientific literature decreases with the popularity of a research field” (Pfeiffer and Hoffmann, 2009). As we know, deep learning is very popular and the ability to reproduce results is an important part of science. There is growing concern within the deep learning community about the reproducibility of results that are presented. In this paper we present a number of controllable, yet unreported, effects tha…

Cited by 61 publications (42 citation statements). References 7 publications (5 reference statements).
“…One should further keep in mind an important caveat in interpreting the results in Table 3: As Reimers and Gurevych (2017) have discussed, non-determinism associated with training neural networks can yield significant differences in accuracy. Crane (2018) further demonstrated that for answer selection in question answering, a range of mundane issues such as software versions can have a significant impact on accuracy, and these effects can be larger than incremental improvements reported in the literature. We adopt the emerging best practice of reporting results from multiple trials, but this makes comparison to previous single-point results difficult.…”
Section: Results (mentioning)
confidence: 90%
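The multi-trial reporting practice described in this statement can be made concrete with a short sketch. This is a hypothetical illustration, not code from the cited papers: `run_trials` and the `train_and_evaluate` callback are assumed placeholder names, and the seeding shown covers only the sources of randomness the caller controls.

```python
import random
import statistics

import numpy as np
import torch


def run_trials(train_and_evaluate, seeds):
    """Run the same experiment under several random seeds and report
    mean and standard deviation instead of a single-point result."""
    scores = []
    for seed in seeds:
        # Fix every source of randomness we control; hardware- and
        # library-level nondeterminism may still remain.
        random.seed(seed)
        np.random.seed(seed)
        torch.manual_seed(seed)
        scores.append(train_and_evaluate(seed))
    return statistics.mean(scores), statistics.stdev(scores)


# Hypothetical usage with some experiment function `my_experiment`:
# mean_map, std_map = run_trials(my_experiment, seeds=range(5))
# print(f"MAP = {mean_map:.3f} ± {std_map:.3f} over 5 trials")
```

Reporting the spread across seeds is exactly what makes comparison to earlier single-point numbers awkward, as the quoted statement notes.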
“…In general, combining cross-pair and intra-pair similarities (with kernel sum or meta-classifiers) provides state-of-the-art results without using deep learning. Additionally, the outcome is deterministic, while the DNN accuracy may vary depending on the type of hardware used or the random initialization parameters (Crane, 2018). Tables 5, 6 and 7 report the performance of the most recent state-of-the-art systems on WikiQA, TREC13 and SemEval in comparison with our best results.…”
Section: Results (mentioning)
confidence: 99%
“…On the WikiQA dataset, our method does not seem to be robust to structural hyperparameter changes. Crane (2018) mentions that on the WikiQA dataset a neural matching model (Severyn and Moschitti, 2015) trained with different random seeds can result in differences of up to 0.08 in MAP and MRR. We leave further investigation of the high variance on the WikiQA dataset for future work.…”
Section: Discussion (mentioning)
confidence: 99%
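For context on the scale of that 0.08 swing, MAP and MRR are rank-based metrics averaged over the per-question candidate rankings. Below is a minimal sketch of how they are computed; the helper names and toy relevance labels are hypothetical, not taken from the cited work.

```python
def reciprocal_rank(relevance):
    """relevance: 0/1 labels in ranked order; return 1/rank of the
    first relevant answer, or 0 if no answer is relevant."""
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            return 1.0 / rank
    return 0.0


def average_precision(relevance):
    """Mean of precision@k taken at each rank k that holds a relevant answer."""
    hits, precisions = 0, []
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


# MRR and MAP average these per-question values over the test set, so a
# reshuffled candidate ranking under a different seed can shift both metrics.
queries = [[0, 1, 0], [1, 0, 0]]  # toy ranked relevance labels per question
mrr = sum(reciprocal_rank(q) for q in queries) / len(queries)
map_score = sum(average_precision(q) for q in queries) / len(queries)
print(f"MRR = {mrr:.3f}, MAP = {map_score:.3f}")  # MRR = 0.750, MAP = 0.750
```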