Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 2020
DOI: 10.18653/v1/2020.acl-main.309
An Analysis of the Utility of Explicit Negative Examples to Improve the Syntactic Abilities of Neural Language Models

Abstract: We explore the utilities of explicit negative examples in training neural language models. Negative examples here are incorrect words in a sentence, such as barks in *The dogs barks. Neural language models are commonly trained only on positive examples, a set of sentences in the training data, but recent studies suggest that the models trained in this way are not capable of robustly handling complex syntactic constructions, such as long-distance agreement. In this paper, we first demonstrate that appropriately…

Cited by 14 publications (18 citation statements). References 21 publications (41 reference statements).
“…This suggests that all monolingual models learned the basic facts of agreement, and were able to apply them to the vocabulary items in our materials. At the other end of the spectrum, performance was only slightly higher than chance in the Across an Object Relative Clause condition for all languages except German, suggesting that LSTMs tend to struggle with center embedding, that is, when a subject-verb dependency is nested within another dependency of the same kind (Marvin and Linzen, 2018; Noji and Takamura, 2020).…”

Section: LSTMs
confidence: 99%
“…Our work is in this spirit for negations. Noji and Takamura (2020) propose taking advantage of negative examples and unlikelihood in the training of language models to increase their syntactic abilities. Similarly, Min et al. (2020) show the effectiveness of syntactic data augmentation in the case of robustness in NLI.…”

Section: Related Work
confidence: 99%
“…where δ is the margin between the log-likelihoods of x and x*. This was originally proposed for analyzing the syntactic abilities of language models (Noji and Takamura, 2020). This loss is useful for developing better language models.…”

Section: Sentence-level Margin Loss (Sent)
confidence: 99%
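The sentence-level margin loss described in the excerpt above can be sketched in a few lines. This is a minimal illustration under the assumption that sentence log-likelihoods for the grammatical sentence x and its negative counterpart x* are already available; it is not the citing paper's exact implementation.

```python
def sentence_margin_loss(logp_pos, logp_neg, delta=1.0):
    """Hinge-style margin loss between a grammatical sentence x and its
    negative counterpart x*: the loss is zero only when log p(x) exceeds
    log p(x*) by at least the margin delta."""
    return max(0.0, delta - (logp_pos - logp_neg))

# Toy log-likelihoods: the model slightly prefers the correct sentence,
# but by less than the margin, so a residual loss remains.
loss = sentence_margin_loss(logp_pos=-12.3, logp_neg=-12.8, delta=1.0)
```

With a margin of 1.0 and a log-likelihood gap of only 0.5, the loss here is the shortfall of roughly 0.5; once the gap exceeds the margin, the loss clamps to zero and the pair stops contributing gradient.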
“…Huang et al. (2018) introduced a margin loss that estimates the quality of each beam-searched candidate by comparing it with the reference sentence. More recently, Noji and Takamura (2020) showed that negative examples help to improve the syntactic ability of neural language models. They created negative instances from original instances by injecting a grammatical error and used them to calculate a margin loss that is added to the cross-entropy loss.…”

Section: Related Work
confidence: 99%
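The procedure in the excerpt above — inject a grammatical error to create a negative instance, then add a margin term to the cross-entropy loss — can be sketched as follows. The `AGREEMENT_SWAPS` table and helper names are hypothetical simplifications for illustration; the actual papers construct negatives systematically from agreement-bearing verbs.

```python
# Hypothetical table of wrong-number verb forms used to inject
# subject-verb agreement errors (an assumption for this sketch).
AGREEMENT_SWAPS = {"bark": "barks", "barks": "bark", "is": "are", "are": "is"}

def make_negative(tokens, target_index):
    """Create a negative instance by swapping one verb for its
    wrong-number form, e.g. 'The dogs bark' -> '*The dogs barks'."""
    negative = list(tokens)
    negative[target_index] = AGREEMENT_SWAPS[negative[target_index]]
    return negative

def total_loss(cross_entropy, logp_pos, logp_neg, delta=1.0):
    """Standard cross-entropy on the positive data plus a hinge margin
    term comparing the positive and negative log-likelihoods."""
    return cross_entropy + max(0.0, delta - (logp_pos - logp_neg))

neg = make_negative(["The", "dogs", "bark"], 2)
```

The margin term only fires on pairs the model still gets (nearly) wrong, so ordinary language-model training is left untouched wherever the model already separates the grammatical sentence from its corrupted twin.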