Partially-Aligned Data-to-Text Generation with Distant Supervision

Fu, Zihao; Shi, Bei; Lam, Wai; Liu, Zhiyuan

doi:10.18653/v1/2020.emnlp-main.738

Cited by 11 publications

(24 citation statements)

References 25 publications

“…al. [186] propose the adaptation of the seq2seq framework for their partially-aligned dataset WITA using a supportiveness adaptor and a rebalanced beam search. The pre-trained adaptor calculates supportiveness scores for each word in the generated text with respect to the input.…”

Section: Regularization Techniques (mentioning)

confidence: 99%

Innovations in Neural Data-to-text Generation

Sharma, Gogineni, Ramakrishnan

2022

Preprint

The neural boom that has sparked natural language processing (NLP) research through the last decade has similarly led to significant innovations in data-to-text generation (DTG). This survey offers a consolidated view into the neural DTG paradigm with a structured examination of the approaches, benchmark datasets, and evaluation protocols. This survey draws boundaries separating DTG from the rest of the natural language generation (NLG) landscape, encompassing an up-to-date synthesis of the literature, and highlighting the stages of technological adoption from within and outside the greater NLG umbrella. With this holistic view, we highlight promising avenues for DTG research that not only focus on the design of linguistically capable systems but also systems that exhibit fairness and accountability. Index Terms: data-to-text generation (DTG), natural language generation (NLG). Footnote 1: This survey exclusively focuses on academic innovations for data-to-text generation, as the technologies underlying commercial frameworks are often proprietary.

“…Gardent et al (2017) introduced the WebNLG challenge, which aimed to generate text from a small set of RDF knowledge triples (no more than 7) that are well-aligned with the text. To avoid the high cost of preparing such well-aligned data, researchers also studied how to leverage automatically obtained partially-aligned data in which some portion of the output text cannot be generated from the input triples (Fu et al, 2020b). introduced AGENDA dataset, which aimed to generate paper abstract from a title and a small KG built by information extraction system on the abstracts and has at most 7 relations.…”

Section: Related Work (mentioning)

confidence: 99%

ENT-DESC: Entity Description Generation by Exploring Knowledge Graph

Cheng, Wu, Zhang et al. 2020

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Self Cite

Previous works on knowledge-to-text generation take as input a few RDF triples or key-value pairs conveying the knowledge of some entities to generate a natural language description. Existing datasets, such as WIKIBIO, WebNLG, and E2E, basically have a good alignment between an input triple/pair set and its output text. However, in practice, the input knowledge could be more than enough, since the output description may only cover the most significant knowledge. In this paper, we introduce a large-scale and challenging dataset to facilitate the study of such a practical scenario in KG-to-text. Our dataset involves retrieving abundant knowledge of various types of main entities from a large knowledge graph (KG), which makes the current graph-to-sequence models severely suffer from the problems of information loss and parameter explosion while generating the descriptions. We address these challenges by proposing a multi-graph structure that is able to represent the original graph information more comprehensively. Furthermore, we also incorporate aggregation methods that learn to extract the rich graph information. Extensive experiments demonstrate the effectiveness of our model architecture. * Liying Cheng is under the Joint Ph.D. Program between Alibaba and Singapore University of Technology and Design. † Dekun Wu was a visiting student at SUTD. Yan Zhang and Zhanming Jie were interns at Alibaba.

“…Wiseman et al (2017) generate basketball match descriptions based on the game records. Moreover, Fu et al (2020c) propose to directly train the model on partially-aligned data called WITA while Fu et al (2020b) propose to train a model based on purely unaligned data unsupervised with a dual learning framework. All of the above problems aim at converting some formatted data into natural language texts facilitating more understandability.…”

Section: Related Work (mentioning)

confidence: 99%

Dynamic Topic Tracker for KB-to-Text Generation

Lam, Jameel

2020

Proceedings of the 28th International Conference on Computational Linguistics

Self Cite

Recently, many KB-to-text generation tasks have been proposed to bridge the gap between knowledge bases and natural language by directly converting a group of knowledge base triples into human-readable sentences. However, most of the existing models suffer from the off-topic problem, namely, the models are prone to generate some unrelated clauses that are somehow involved with certain input terms regardless of the given input data. This problem seriously degrades the quality of the generation results. In this paper, we propose a novel dynamic topic tracker for solving this problem. Different from existing models, our proposed model learns a global hidden representation for topics and recognizes the corresponding topic during each generation step. The recognized topic is used as additional information to guide the generation process and thus alleviates the off-topic problem. The experimental results show that our proposed model can enhance the performance of sentence generation and the off-topic problem is significantly mitigated.
