Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/2021.emnlp-main.608

Constrained Language Models Yield Few-Shot Semantic Parsers

Abstract: We explore the use of large pretrained language models as few-shot semantic parsers. The goal in semantic parsing is to generate a structured meaning representation given a natural language input. However, language models are trained to generate natural language. To bridge the gap, we use language models to paraphrase inputs into a controlled sublanguage resembling English that can be automatically mapped to a target meaning representation. Our results demonstrate that with only a small amount of data and very …
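The abstract describes constraining generation so the model can only produce strings of a controlled sublanguage. A minimal sketch of that idea, assuming a toy scoring function in place of a real pretrained LM and a token-level trie standing in for the sublanguage grammar (all names here are illustrative, not the paper's implementation):

```python
# Constrained greedy decoding sketch: at each step, only tokens that
# keep the output inside the controlled sublanguage (encoded as a trie
# of canonical utterances) are allowed; the scorer picks among them.

def build_trie(utterances):
    """Build a token-level trie; the key None marks the end of a valid utterance."""
    trie = {}
    for utt in utterances:
        node = trie
        for tok in utt.split():
            node = node.setdefault(tok, {})
        node[None] = {}
    return trie

def constrained_decode(score, trie):
    """Greedily pick the highest-scoring token among those the trie
    allows, until a complete canonical utterance has been produced."""
    out, node = [], trie
    while None not in node:
        allowed = [t for t in node if t is not None]
        best = max(allowed, key=lambda t: score(out, t))
        out.append(best)
        node = node[best]
    return " ".join(out)

canonical = ["create event tomorrow", "create reminder today"]
trie = build_trie(canonical)
# Toy scorer: prefers tokens that appear in a hypothetical user input,
# standing in for LM log-probabilities conditioned on that input.
user_input = "please set up an event for tomorrow"
score = lambda prefix, tok: user_input.count(tok)
print(constrained_decode(score, trie))  # -> create event tomorrow
```

With a real LM, `score` would be the model's next-token log-probability and the trie would be replaced by the grammar of the canonical sublanguage; the masking step is the same.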

Cited by 76 publications (87 citation statements) | References 49 publications
“…Such approaches include prompt engineering through the use of manual patterns (Petroni et al., 2019; Schick and Schütze, 2021), and also methods for extracting either hard (Shin et al., 2020; Haviv et al., 2021) or soft (Li and Liang, 2021; Zhong et al., 2021; Qin and Eisner, 2021) prompts automatically. Shin et al. (2021) used GPT-3 to select training examples for in-context learning. However, their focus was not on training a prompt retriever, but instead on representing logical forms with a pseudo-language, and applying constraints based on the formal language at decoding time to improve generation.…”
Section: Discussion
confidence: 99%
“…Conversely, our approach takes advantage of the generative LM itself and is thus more general. Shin et al. (2021) used GPT-3 to select examples for the prompt in the context of few-shot semantic parsing. However, rather than training a retriever, they randomly sample a large set of question-program pairs from the training set, and choose those that are similar to the target instance question according to GPT-3.…”
Section: Retriever Index
confidence: 99%
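The excerpt above describes choosing prompt examples by similarity to the target question. A hedged sketch of that selection step, using simple token-overlap (Jaccard) similarity as a stand-in for the GPT-3-based scoring the excerpt mentions (the pool contents and function names are illustrative):

```python
# Select the most similar (question, program) pairs from a candidate
# pool to serve as in-context examples for a few-shot semantic parser.

def jaccard(a, b):
    """Token-overlap similarity between two questions."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

def select_examples(target, pool, k=2):
    """Return the k pairs whose questions best match the target question."""
    return sorted(pool, key=lambda qp: jaccard(target, qp[0]), reverse=True)[:k]

pool = [
    ("what meetings do i have today", "(listEvents (today))"),
    ("create a meeting tomorrow at noon", "(createEvent (tomorrow) (noon))"),
    ("delete my last reminder", "(deleteReminder (last))"),
]
target = "what events do i have today"
print(select_examples(target, pool, k=1))
# -> [('what meetings do i have today', '(listEvents (today))')]
```

In the cited work the similarity signal comes from GPT-3 itself rather than token overlap; only the sampling-then-ranking structure is reproduced here.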
“…Still, as discussed in §3.1.1, we remark that for users proficient in the grammar formalism, curating a handful of idiomatic production rules is still more efficient than labeling parallel samples to exhaustively cover compositional logical patterns and diverse language styles, and the number of annotated samples required could be orders of magnitude larger than the size of the grammar. Meanwhile, the process of creating production rules could potentially be simplified by allowing users to define them using natural language instead of λ-calculus logical rules, similar in spirit to studies on naturalizing programs using canonical language (Wang et al., 2017; Shin et al., 2021; Herzig et al., 2021).…”
Section: Limitations and Discussion
confidence: 99%
“…Like our model, such methods use lexicons to capture alignments between NL phrases and logical predicates (Goldwasser et al., 2011), while our method does not require real utterances. Finally, methods based on OVERNIGHT (Wang et al., 2015) synthesize parallel corpora from SCFGs (Cheng et al., 2019; Xu et al., 2020a) or neural sequence models (Guo et al., 2018), and attempt to bridge the gaps between canonical and real utterances via paraphrase detection and generation (Su and Yan, 2017; Shin et al., 2021), or representation learning (Marzoev et al., 2020).…”
Section: Related Work
confidence: 99%
“…Neural semantic parsing uses sequence models to formalize natural language sentences (Kamath and Das, 2019). Shin et al. (2021) show that PTLMs are few-shot parsers, and that intermediate steps which rephrase and streamline the original input before parsing it into a formal language improve accuracy.…”
Section: Related Work
confidence: 99%