Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/2021.emnlp-main.779
PICARD: Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models

Abstract: Large pre-trained language models for textual data have an unconstrained output space; at each decoding step, they can produce any of tens of thousands of sub-word tokens. When fine-tuned to target constrained formal languages like SQL, these models often generate invalid code, rendering it unusable. We propose PICARD, a method for constraining auto-regressive decoders of language models through incremental parsing. PICARD helps to find valid output sequences by rejecting inadmissible tokens at each decoding step. On…
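The abstract's core mechanism, rejecting inadmissible tokens at each decoding step, can be illustrated with a minimal greedy-decoding sketch. Everything below is hypothetical and not from the paper's implementation: `toy_score` stands in for the language model's token scores, and `toy_valid` stands in for PICARD's incremental parser, which here only accepts prefixes of one fixed SQL query.

```python
def constrained_decode(score_fn, is_valid_prefix, vocab, max_len):
    """Greedy decoding that skips tokens whose addition yields an invalid prefix."""
    seq = []
    for _ in range(max_len):
        # Rank candidate tokens by model score, best first.
        ranked = sorted(vocab, key=lambda t: score_fn(seq, t), reverse=True)
        for tok in ranked:
            if is_valid_prefix(seq + [tok]):
                seq.append(tok)  # accept the best admissible token
                break
        else:
            break  # no admissible token remains: stop decoding
        if tok == "<eos>":
            break
    return seq

VOCAB = ["SELECT", "*", "FROM", "t", "<eos>"]

def toy_score(prefix, tok):
    # Stand-in for language-model scores; deliberately prefers '*' everywhere,
    # so unconstrained greedy decoding would emit invalid output.
    return {"*": 3, "SELECT": 2, "FROM": 1, "t": 1, "<eos>": 0}[tok]

def toy_valid(prefix):
    # Toy "parser": accepts exactly the prefixes of one fixed query.
    target = ["SELECT", "*", "FROM", "t", "<eos>"]
    return prefix == target[: len(prefix)]

print(constrained_decode(toy_score, toy_valid, VOCAB, max_len=5))
# ['SELECT', '*', 'FROM', 't', '<eos>']
```

Without the validity check, the same scorer would greedily emit `*` at every step; the constraint is what steers decoding onto a parseable sequence. PICARD itself applies this idea with a real incremental SQL parser over a beam, but the rejection loop is the same shape.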

Cited by 121 publications (114 citation statements) · References 12 publications
“…For SQL semantic parsing (Spider, SParC, and CoSQL), a large number of errors are caused by invalid outputs, and the number of invalid outputs gradually decreases as model size increases. This phenomenon is also observed by Scholak et al. (2021), who further used the PICARD method to improve output validity, largely improving parsing performance. For s-expression semantic parsing (GrailQA and WebQSP), invalid predictions account for 30-50% of all incorrect predictions, and increasing the model size does not significantly reduce invalidity.…”
Section: Error Analysis (supporting)
confidence: 56%
“…Some semantic parsing SOTA models, denoted as + in Table 2, are also T5 with post hoc modification, e.g., constrained decoding (Scholak et al., 2021) or reranking (Ye et al., 2021b). We conclude that T5, with simple modification when necessary, achieves SOTA on almost all the tasks.…”
Section: Experiments and Results On Individual Tasks (mentioning)
confidence: 84%
“…This approach achieves 32.57 BLEU on the CoNaLa dataset, compared to the same setup without TAE, which scores 30.98 BLEU. Scholak et al. (2021) propose PICARD, a simple and effective decoder-constraint algorithm that works with pretrained encoder-decoder models. Using PICARD with a T5-3B model (Raffel et al., 2019b) achieves state of the art on two SQL generation tasks from NL: Spider (Yu et al., 2018) and CoSQL (Yu et al., 2019a).…”
Section: Pretrained Transformer Models (mentioning)
confidence: 99%
“…Others have proposed to use a generative model like BART to augment the dataset by paraphrasing natural language utterances (Xu et al., 2020b). Recently, it has been shown that T5 can be successfully fine-tuned on a large-scale text-to-SQL dataset (Shaw et al., 2021; Scholak et al., 2021).…”
Section: Semantic Parsing (mentioning)
confidence: 99%