2021
DOI: 10.1609/aaai.v35i15.17627

Learning Contextual Representations for Semantic Parsing with Generation-Augmented Pre-Training

Abstract: Most recently, there has been significant interest in learning contextual representations for various NLP tasks, by leveraging large-scale text corpora to train powerful language models with self-supervised learning objectives, such as Masked Language Model (MLM). Based on a pilot study, we observe three issues with existing general-purpose language models when they are applied to text-to-SQL semantic parsers: they fail to detect column mentions in the utterances, to infer column mentions from the cell va…
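The issues listed in the abstract arise in the schema-linking step, where the parser must align utterance tokens with table and column names. Below is a minimal, hypothetical sketch of the kind of question-plus-schema input a text-to-SQL encoder consumes; the special tags and the example schema are illustrative assumptions, not the paper's exact serialization.

```python
# Minimal, hypothetical sketch of the input a text-to-SQL encoder sees: the
# utterance concatenated with table and column names. Column-mention detection
# amounts to aligning utterance tokens with the schema items in this string.
# The special tags and the example schema are illustrative assumptions.

def serialize_input(question, schema):
    """schema: mapping of table name -> list of column names."""
    parts = [f"<question> {question}"]
    for table, columns in schema.items():
        parts.append(f"<table> {table} " + " ".join(f"<col> {c}" for c in columns))
    return " ".join(parts)

if __name__ == "__main__":
    print(serialize_input(
        "Show the names of singers older than 30",
        {"singer": ["singer_id", "name", "age", "country"]},
    ))
```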

Cited by 47 publications (18 citation statements)
References 50 publications

Citation statements (ordered by relevance):

“…(9) Global-GNN [3] proposes a semantic parser that globally reasons about the structure of the output query to make a more contextually informed selection of database constants. Many previous works design adaptive PLM models for specific Text-to-SQL models to achieve better results, such as GAP [35], GRAPPA [45], STRUG [9]. For a fair comparison, in addition to comparing results on a unified pre-training model (BERT-large), we also report results with the model-adaptive PLMs.…”
Section: Methods (mentioning, confidence: 99%)
“…To make the task feasible, we keep the condition values (e.g., prescription name) the same in this work, except for genders and vital signs, as this is another major challenge in semantic parsing [30,29]. When splitting the dataset into train, validation, and test sets, we ensure that all the question templates are present in each split.…”
Section: Task (mentioning, confidence: 99%)
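The statement above describes a template-aware split. Here is a hedged sketch of one way to implement it, assuming each example carries a `template` field and that every template has enough instances to land in all three splits; the field name and the 80/10/10 ratios are assumptions.

```python
import random
from collections import defaultdict

def split_by_template(examples, ratios=(0.8, 0.1, 0.1), seed=0):
    """Group examples by their question template, then split each group so that
    every template contributes examples to train, dev, and test."""
    rng = random.Random(seed)
    by_template = defaultdict(list)
    for ex in examples:
        by_template[ex["template"]].append(ex)

    train, dev, test = [], [], []
    for items in by_template.values():
        rng.shuffle(items)
        n_train = max(1, int(len(items) * ratios[0]))
        n_dev = max(1, int(len(items) * ratios[1]))
        train.extend(items[:n_train])
        dev.extend(items[n_train:n_train + n_dev])
        test.extend(items[n_train + n_dev:])
    return train, dev, test
```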
“…We choose MIMICSQL, which contains simple SQL queries that can all be parsed with the Spider grammar, and EHRSQL for the healthcare datasets. Among the many SOTA models on the Spider leaderboard, we use Generation-Augmented Pre-training (GAP) [29] to test its zero-shot domain transfer performance.…”
Section: Model Development (mentioning, confidence: 99%)
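A hedged sketch of what zero-shot transfer means operationally: a model trained on Spider-style data generates SQL for an unseen healthcare question without any fine-tuning. The checkpoint path, the input serialization, and the use of a plain BART seq2seq interface are assumptions made for illustration; in the cited work GAP serves as the pre-trained encoder inside a full parser rather than as a stand-alone seq2seq model.

```python
from transformers import BartForConditionalGeneration, BartTokenizerFast

MODEL_DIR = "./gap_text2sql_checkpoint"  # hypothetical local checkpoint, not an official release

tokenizer = BartTokenizerFast.from_pretrained(MODEL_DIR)
model = BartForConditionalGeneration.from_pretrained(MODEL_DIR)

# Healthcare-domain question the model never saw during training (zero-shot).
question = "How many patients were prescribed aspirin in 2019?"
schema = "<table> prescriptions <col> patient_id <col> drug <col> start_year"

inputs = tokenizer(f"{question} {schema}", return_tensors="pt")
output_ids = model.generate(**inputs, max_length=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```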
“…Among them, TaBERT (Yin et al. 2020) and TAPAS (Herzig et al. 2020) design structure-related unsupervised objectives for further pretraining the BERT model over millions of web tables and the surrounding text. GRAPPA (Yu et al. 2021) and GAP (Shi et al. 2021) utilize data augmentation techniques to synthesize a high-quality pretraining corpus and respectively pretrain a RoBERTa and a BART model.…”
Section: Semantic Parsing (mentioning, confidence: 99%)
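A toy sketch of the data-augmentation idea described above: sample simple SQL over a schema and pair it with an utterance to build a synthetic (utterance, SQL) pretraining corpus. The real pipelines use crawled tables and a learned SQL-to-text generator; the template below is only a stand-in for that generator, and the schema is invented.

```python
import random

# Invented toy schema; the cited systems crawl millions of real web tables.
SCHEMA = {"singer": ["name", "age", "country"], "concert": ["venue", "year"]}

def sample_pair(rng):
    """Sample one synthetic (utterance, SQL) pair over the toy schema."""
    table = rng.choice(sorted(SCHEMA))
    column = rng.choice(SCHEMA[table])
    sql = f"SELECT {column} FROM {table}"
    # Stand-in for a learned SQL-to-text generator.
    utterance = f"Show the {column} of every {table}."
    return utterance, sql

if __name__ == "__main__":
    rng = random.Random(0)
    for utterance, sql in (sample_pair(rng) for _ in range(3)):
        print(utterance, "=>", sql)
```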
“…Despite the promising performance, current PLM-based approaches mostly regard both input and output as plain text sequences and neglect the structural information contained in the sentences (Yin et al. 2020; Shi et al. 2021), such as the database (DB) or knowledge base (KB) schema that essentially constitutes the key semantics of the target SQL or SPARQL logical forms. As a result, these PLM-based models often suffer from the hallucination issue (Ji et al. 2022) and may generate incorrect logical form structures that are unfaithful to the input utterance (Nicosia, Qu, and Altun 2021; Gupta et al. 2022).…”
Section: Introduction (mentioning, confidence: 99%)