Visually Grounded Continual Learning of Compositional Phrases

Jin, Xisen; Du, Junyi; Sadhu, Arka; Nevatia, Ram; Ren, Xiang

doi:10.18653/v1/2020.emnlp-main.158

Cited by 10 publications

(10 citation statements)

References 28 publications


“…Is there a dataset that can prevent all shortcuts? Our automatic method for creating contrast sets allows us to ask those questions, while we believe that future work in better training mechanisms, as suggested in and Jin et al (2020), could help in making more robust models.…”

Section: Discussion (mentioning)

confidence: 99%

Automatic Generation of Contrast Sets from Scene Graphs: Probing the Compositional Consistency of GQA

Bitton, Stanovsky, Schwartz, et al. 2021

Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Langua


Recent works have shown that supervised models often exploit data artifacts to achieve good test scores while their performance severely degrades on samples outside their training distribution. Contrast sets quantify this phenomenon by perturbing test samples in a minimal way such that the output label is modified. While most contrast sets were created manually, requiring intensive annotation effort, we present a novel method which leverages rich semantic input representation to automatically generate contrast sets for the visual question answering task. Our method computes the answer of perturbed questions, thus vastly reducing annotation cost and enabling thorough evaluation of models' performance on various semantic aspects (e.g., spatial or relational reasoning). We demonstrate the effectiveness of our approach on the popular GQA dataset (Hudson and Manning, 2019) and its semantic scene graph image representation. We find that, despite GQA's compositionality and carefully balanced label distribution, two strong models drop 13-17% in accuracy on our automatically-constructed contrast set compared to the original validation set. Finally, we show that our method can be applied to the training set to mitigate the degradation in performance, opening the door to more robust models.

“…As with different characteristics of different computer vision problems, a simple adaptation of methods proposed for image classification may not lead to satisfactory performance in other computer vision problems. For example, in video grounding, Jin et al [218] pointed out that simple adaptation of ideas from image classification fails with this compositional phrases learning scenario of the language input. In visual question answering (VQA), Perez et al [219] pointed out that their model cannot preserve previously learned knowledge well after trained continuously on objects with different colors.…”

Section: Discussion (mentioning)

confidence: 99%

Recent Advances of Continual Learning in Computer Vision: An Overview

Qu, Rahmani, et al. 2021

Preprint


“…Cao et al (2021) propose a new Continual Learning framework for NMT models, while Ke et al (2021) proposes a novel capsule network based model called B-CL (Bert based Continual Learning) for sentiment classification tasks. Jin et al (2020) show how existing Continual Learning algorithms fail at learning compositional phrases. More recently Sun et al (2019) propose a lifelong learning method LAMOL that is capable of continually learning new tasks by replaying pseudo-samples of previous tasks that require no extra memory or model capacity.…”

Section: Related Work (mentioning)

confidence: 98%

Fine-tuned Language Models are Continual Learners

Scialom, Chakrabarty, Muresan 2022

Preprint


Recent work on large language models relies on the intuition that most natural language processing tasks can be described via natural language instructions. Language models trained on these instructions show strong zero-shot performance on several standard datasets. However, these models even though impressive still perform poorly on a wide range of tasks outside of their respective training and evaluation sets. To address this limitation, we argue that a model should be able to keep extending its knowledge and abilities, without forgetting previous skills. In spite of the limited success of Continual Learning we show that Language Models can be continual learners. We empirically investigate the reason for this success and conclude that Continual Learning emerges from self-supervision pre-training. Our resulting model Continual-T0 (CT0) is able to learn diverse new tasks, while still maintaining good performance on previous tasks, spanning remarkably through 70 datasets in total. Finally, we show that CT0 is able to combine instructions in ways it was never trained for, demonstrating some compositionality.
