“…With the goal of creating general models that generalize compositionally across a wide range of tasks, in this paper we explore the design space of Transformer models, showing that several design decisions, such as position encodings, decoder type, weight sharing, model hyper-parameters, and the formulation of the target task, result in different inductive biases, with significant impact on compositional generalization. In order to evaluate these design decisions, we use a collection of twelve datasets designed to measure compositional generalization. In addition to six standard datasets commonly used in the literature (such as SCAN (Lake and Baroni, 2018), PCFG (Hupkes et al., 2020), CFQ (Keysers et al., 2019), and COGS (Kim and Linzen, 2020)), we also use a set of basic algorithmic tasks (such as addition, duplication, or set intersection) that, although not directly involving natural language, are useful for gaining insight into what can and cannot be learned with different Transformer models.…”
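To give a concrete sense of what these algorithmic tasks look like as sequence-to-sequence problems, here is a minimal sketch of example generators for addition, duplication, and set intersection. The function names and token formats are assumptions for illustration only; the excerpt does not specify the exact encodings used in the paper.

```python
import random

# Hypothetical generators for the three algorithmic tasks named above.
# Each returns a (source, target) string pair; the exact token formats
# in the paper may differ.

def addition_example(max_digits=4):
    # e.g. input "12+345", target "357"
    a = random.randint(0, 10**max_digits - 1)
    b = random.randint(0, 10**max_digits - 1)
    return f"{a}+{b}", str(a + b)

def duplication_example(alphabet="abcd", max_len=8):
    # e.g. input "abca", target "abcaabca" (copy the input twice)
    s = "".join(random.choice(alphabet) for _ in range(random.randint(1, max_len)))
    return s, s + s

def intersection_example(universe="abcdefgh", max_len=5):
    # e.g. input "abc|bcd", target "bc" (symbols common to both sets)
    x = set(random.sample(universe, random.randint(1, max_len)))
    y = set(random.sample(universe, random.randint(1, max_len)))
    return "".join(sorted(x)) + "|" + "".join(sorted(y)), "".join(sorted(x & y))

if __name__ == "__main__":
    for gen in (addition_example, duplication_example, intersection_example):
        src, tgt = gen()
        print(f"{gen.__name__}: {src!r} -> {tgt!r}")
```

A typical compositional-generalization split for such tasks trains on short inputs (e.g., few digits or short sequences) and evaluates on longer ones, so that success requires learning the underlying rule rather than memorizing surface patterns.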