Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2017
DOI: 10.18653/v1/p17-1147

TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension

Abstract: We present TriviaQA, a challenging reading comprehension dataset containing over 650K question-answer-evidence triples. TriviaQA includes 95K question-answer pairs authored by trivia enthusiasts and independently gathered evidence documents, six per question on average, that provide high quality distant supervision for answering the questions. We show that, in comparison to other recently introduced large-scale datasets, TriviaQA (1) has relatively complex, compositional questions, (2) has considerable syntactic and lexical variability between questions and corresponding answer-evidence sentences, and (3) requires more cross-sentence reasoning to find answers.
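
The abstract's central mechanism, pairing each question-answer pair with every independently gathered document that mentions the answer, is easy to illustrate. Below is a minimal sketch of that distant-supervision pairing; the `Triple` type and the function name are illustrative, not from the TriviaQA release.

```python
# Minimal sketch of the distant-supervision idea from the abstract:
# an evidence document counts as (noisy) support for a question if the
# answer string appears somewhere in its text. All names here are
# illustrative, not part of the TriviaQA release.
from dataclasses import dataclass

@dataclass
class Triple:
    question: str
    answer: str
    evidence: str  # one gathered evidence document

def distantly_supervised(question: str, answer: str,
                         documents: list[str]) -> list[Triple]:
    """Pair a QA pair with every evidence document that mentions the answer."""
    answer_lc = answer.lower()
    return [
        Triple(question, answer, doc)
        for doc in documents
        if answer_lc in doc.lower()  # naive string match; real pipelines normalize more
    ]

docs = [
    "Harper Lee wrote To Kill a Mockingbird, published in 1960.",
    "The novel is set in the fictional town of Maycomb, Alabama.",
]
triples = distantly_supervised(
    "Which author wrote To Kill a Mockingbird?", "Harper Lee", docs
)
print(len(triples))  # 1: only the first document mentions the answer
```

Because the match is purely lexical, some retained documents will mention the answer without actually supporting it; that noise is exactly why the paper calls this distant rather than direct supervision.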

Cited by 1,074 publications (1,077 citation statements)
References 20 publications
“…These subjects are medicine (4k questions), history (3k questions), and biology (2k questions). The resulting dataset is somewhat similar to the TriviaQA dataset; however, the domains are different [8].…”
Section: Multiple Choice Question Answering (MCQA)
Mentioning confidence: 86%
“…TriviaQA (Joshi et al., 2017) contains automatically collected question-answer pairs from 14 trivia and quiz-league websites, together with web-crawled evidence documents from Wikipedia and Bing. While a majority of questions require world knowledge to find the correct answer, it is mostly factual knowledge.…”
Section: Related Work
Mentioning confidence: 99%
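
For readers who want to inspect the corpus this statement describes, here is a minimal loading sketch. It assumes the Hugging Face `datasets` mirror of TriviaQA; the dataset id `trivia_qa`, the `rc` configuration name, and the field layout are assumptions about that mirror, not details stated in this report.

```python
# Sketch: peeking at TriviaQA through the Hugging Face hub mirror.
# The id "trivia_qa", the "rc" (reading comprehension) configuration,
# and the field names below are assumptions about that mirror.
from datasets import load_dataset

train = load_dataset("trivia_qa", "rc", split="train")
example = train[0]
print(example["question"])         # trivia question authored by enthusiasts
print(example["answer"]["value"])  # canonical answer string
print(len(train))                  # number of training examples
```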
“…CoQA [17] is a large-scale reading comprehension dataset whose questions depend on a conversation history. TriviaQA [21] and SQuAD 2.0 [9] focus on complex reasoning questions, which must be answered by jointly inferring over multiple sentences. Compared with English datasets, Chinese reading comprehension datasets are quite rare.…”
Section: Reading Comprehension Datasets
Mentioning confidence: 99%