Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2019)
DOI: 10.18653/v1/n19-1356

Studying the Inductive Biases of RNNs with Synthetic Variations of Natural Languages

Abstract: How do typological properties such as word order and morphological case marking affect the ability of neural sequence models to acquire the syntax of a language? Crosslinguistic comparisons of RNNs' syntactic performance (e.g., on subject-verb agreement prediction) are complicated by the fact that any two languages differ in multiple typological properties, as well as by differences in training corpus. We propose a paradigm that addresses these issues: we create synthetic versions of English, which differ from…
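The paradigm described in the abstract, re-marking English sentences with overt morphology while holding everything else constant, can be made concrete with a toy example. Below is a minimal, hypothetical Python sketch: the Token class, the dependency labels, and the suffix strings "kar"/"kat" are illustrative assumptions, not the paper's actual annotation scheme or suffixes.

```python
# Toy sketch: add overt case suffixes to subjects and objects of a
# dependency-parsed English sentence, producing a "synthetic English"
# variant. All names and suffixes here are hypothetical.

from dataclasses import dataclass

CASE_SUFFIX = {"nsubj": "kar", "dobj": "kat"}  # hypothetical case markers

@dataclass
class Token:
    form: str
    deprel: str  # dependency relation to the head, e.g. "nsubj"

def add_case_markers(tokens: list[Token]) -> str:
    """Append an overt case suffix to subject and object tokens."""
    out = []
    for tok in tokens:
        suffix = CASE_SUFFIX.get(tok.deprel, "")
        out.append(tok.form + suffix)
    return " ".join(out)

sentence = [
    Token("the", "det"), Token("dog", "nsubj"),
    Token("chases", "root"),
    Token("the", "det"), Token("cat", "dobj"),
]
print(add_case_markers(sentence))  # "the dogkar chases the catkat"
```

Because the synthetic variant is derived from the same corpus as the original, any difference in a model's agreement accuracy can be attributed to the added typological property rather than to corpus or domain differences.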

Cited by 29 publications (19 citation statements). References 24 publications (26 reference statements).
“…Because we have the same amount of training data per language in the same domain, this could point to the importance of having explicit cues to linguistic structure such that models can learn that structure. While more language varieties need to be evaluated to determine whether this trend is robust, we note that this finding is consistent with that of Ravfogel et al. (2019), who compared English to a synthetic variety of English augmented with case markers and found that the addition of case markers increased LSTM agreement prediction accuracy.…”
Section: Morphological Complexity vs. Accuracy (supporting, confidence: 83%)
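The agreement prediction metric these citing studies rely on is compact: a model is counted correct on an example when it scores the grammatical verb form above its ungrammatical counterpart. A minimal sketch, assuming a generic lm_logprob scorer as a stand-in for a trained LSTM language model; the toy_scorer below is purely illustrative:

```python
# Sketch of the subject-verb agreement evaluation: accuracy is the
# fraction of examples where the LM prefers the grammatical verb form.

from typing import Callable

def agreement_accuracy(
    examples: list[tuple[list[str], str, str]],
    lm_logprob: Callable[[list[str], str], float],
) -> float:
    """examples: (left context, correct verb, incorrect verb) triples."""
    correct = 0
    for context, good, bad in examples:
        if lm_logprob(context, good) > lm_logprob(context, bad):
            correct += 1
    return correct / len(examples)

# Toy stand-in scorer that prefers -s verb forms after a singular
# subject; replace with a real LM's next-word log-probability.
def toy_scorer(context: list[str], verb: str) -> float:
    singular_subject = "dog" in context
    return 1.0 if verb.endswith("s") == singular_subject else 0.0

examples = [
    (["the", "dog", "near", "the", "cats"], "barks", "bark"),
]
print(agreement_accuracy(examples, toy_scorer))  # 1.0
```

Examples with intervening distractor nouns of the opposite number (as in the toy context above) are the standard stress test, since a model relying on the most recent noun rather than the true subject will fail them.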
“…Dhar and Bisazza (2018) trained a multilingual LM on a concatenated French and Italian corpus, and tested whether grammatical abilities transfer across languages. Ravfogel et al. (2018) reported an in-depth analysis of LSTM LM performance on agreement prediction in Basque, and Ravfogel et al. (2019) investigated the effect of different syntactic properties of a language on RNNs' agreement prediction accuracy by creating synthetic variants of English. Finally, grammatical evaluation has been proposed for machine translation systems for languages such as German and French (Sennrich, 2017; Isabelle et al., 2017).…”
Section: Grammatical Evaluation Beyond English (mentioning, confidence: 99%)
“…This result indicates some architectural limitations of LSTM-LMs in handling object RCs robustly at a near-perfect level. Answering why the accuracy does not reach (almost) 100%, perhaps with other empirical properties or inductive biases (Khandelwal et al., 2018; Ravfogel et al., 2019), is future work.…”
Section: Limitations of LSTM-LMs (mentioning, confidence: 99%)
“…However, Kuncoro et al (2018) have also shown that although sequential LSTMs can learn syntactic information, a recursive neural network that explicitly models hierarchy (the Recurrent Neural Network Grammar model from Dyer et al [2015]) is better at this: It performs better on the number agreement task from Linzen, Dupoux, and Goldberg (2016). In addition, Ravfogel, Goldberg, and Tyers (2018) and Ravfogel, Goldberg, and Linzen (2019) have cast some doubts on the results by Linzen, Dupoux, and Goldberg (2016) and Gulordava et al (2018) by looking at Basque and synthetic languages with different word orders, respectively, in the two studies.…”
Section: Recursive vs. Recurrent Neural Network (mentioning, confidence: 99%)