Proceedings of the 2020 ACM SIGIR International Conference on Theory of Information Retrieval (ICTIR 2020)
DOI: 10.1145/3409256.3409836

Sanitizing Synthetic Training Data Generation for Question Answering over Knowledge Graphs

Abstract: Synthetic data generation is important for training and evaluating neural models for question answering over knowledge graphs. The quality of the data and the partitioning of the datasets into training, validation, and test splits impact the performance of the models trained on this data. If the synthetic data generation depends on templates, as is the predominant approach for this task, there may be a leakage of information via a shared basis of templates across data splits if the partitioning is not performed …


Cited by 10 publications (8 citation statements). References 31 publications.
“…We compare model performance on the original versus rewritten NL questions in our samples. Specifically, we use neural KGQA models trained on the DBNQA * [16] and GrailQA [11] datasets. 1 The question we seek to answer is how quality improvements on the input NL questions impact the answer prediction effectiveness of the models.…”
Section: Methods
confidence: 99%
“…The extant datasets are taken as the basis to extract templates for both formal queries and NL questions, and those templates are then instantiated with different entity and predicate bindings. DBNQA* [16] partitions DBNQA [12] into training, validation, and test splits based on the underlying templates, avoiding leakage of information between training and test splits. The instances are identical to DBNQA, and so we use DBNQA* in our experiments.…”
Section: Related Work
confidence: 99%
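The template-based partitioning described in the statement above can be sketched as follows. This is a minimal illustration, not the authors' actual code: the `template_id` field, function name, and split ratios are assumptions. The key property is that all instances derived from the same template land in the same split, so no template is shared between training and test data.

```python
import random
from collections import defaultdict

def split_by_template(instances, ratios=(0.8, 0.1, 0.1), seed=42):
    """Partition QA instances into train/valid/test splits such that all
    instances sharing a template fall into the same split, preventing
    template leakage across splits."""
    by_template = defaultdict(list)
    for inst in instances:
        by_template[inst["template_id"]].append(inst)

    # Shuffle template IDs deterministically, then allocate whole
    # templates (not individual instances) to each split.
    template_ids = sorted(by_template)
    random.Random(seed).shuffle(template_ids)

    n = len(template_ids)
    n_train = int(ratios[0] * n)
    n_valid = int(ratios[1] * n)
    groups = (template_ids[:n_train],
              template_ids[n_train:n_train + n_valid],
              template_ids[n_train + n_valid:])
    return tuple([inst for t in g for inst in by_template[t]] for g in groups)
```

A naive instance-level random split would place different instantiations of the same template in both train and test; splitting at the template level is what removes that leak.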
“…More broadly, in Knowledge-Graph Question Answering (KG-QA), work has exploited KGs to generate synthetic data in unseen domains (Linjordet, 2020; Trivedi et al., 2017; Linjordet and Balog, 2020). Our work extends visually-grounded questions with valid common sense KG triplets.…”
Section: Related Work
confidence: 99%
“…In practice, though, the datasets often still contain redundant data points. The redundancies inherent in text data, such as paraphrases, synonyms, etc., can be especially problematic, resulting in train-test leaks [19,24,29]. For instance, the training and test sets of the ELI5 dataset [13] for question answering were created using TF-IDF as a heuristic to eliminate redundancies between them.…”
Section: Background and Related Work
confidence: 99%
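The TF-IDF heuristic mentioned above for detecting train-test redundancy can be sketched as follows. This is an illustrative reconstruction, not the ELI5 authors' exact procedure: the similarity threshold, function names, and the simple tokenization are assumptions. A test question whose TF-IDF cosine similarity to any training question exceeds the threshold is flagged as a likely train-test leak.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Compute simple TF-IDF vectors (raw term frequency * smoothed IDF)
    for whitespace-tokenized, lowercased documents."""
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))
    n = len(docs)
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(toks).items()}
            for toks in tokenized]

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def flag_near_duplicates(train_qs, test_qs, threshold=0.9):
    """Return indices of test questions that are near-duplicates of some
    training question under TF-IDF cosine similarity."""
    vecs = tfidf_vectors(train_qs + test_qs)
    train_vecs, test_vecs = vecs[:len(train_qs)], vecs[len(train_qs):]
    return [i for i, tv in enumerate(test_vecs)
            if any(cosine(tv, trv) >= threshold for trv in train_vecs)]
```

As the citing work notes, such a surface-level heuristic misses paraphrases and synonyms, which is precisely why redundancies can survive into the final splits.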