Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d16-1004

Using Left-corner Parsing to Encode Universal Structural Constraints in Grammar Induction

Abstract: Center-embedding is difficult to process and is known to be a rare syntactic construction across languages. In this paper we describe a method to incorporate this assumption into grammar induction tasks by restricting the search space of a model to trees with limited center-embedding. The key idea is the tabulation of left-corner parsing, which captures the degree of center-embedding of a parse via its stack depth. We apply the technique to learning of a well-known generative model, the dependency model with valence…
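
The constraint the abstract describes, pruning candidate parses whose center-embedding exceeds a small bound, can be sketched concretely. The Python below is a hypothetical illustration, not the authors' tabulated left-corner parser: it computes a simplified, textbook-style degree of center-embedding for a binarized tree (the deepest nesting of constituents that have material on both their left and their right inside an ancestor), which plays the role that stack depth plays in the left-corner algorithm. The function names, the tuple tree encoding, and the exact numbers (which depend on binarization) are all assumptions made for the example.

```python
# Hypothetical sketch (not the paper's implementation): measure a simplified
# degree of center-embedding for a binarized tree and filter parses against a
# small bound, mirroring the role of stack depth in left-corner parsing.
# Trees are nested 2-tuples; leaves are word strings.

def spans(tree, start=0):
    """Annotate every subtree with its yield span (start, end)."""
    if isinstance(tree, str):                        # leaf
        return (start, start + 1, tree), start + 1
    left, pos = spans(tree[0], start)
    right, pos = spans(tree[1], pos)
    return (start, pos, left, right), pos

def center_embedding_degree(tree):
    """Deepest nesting of constituents sitting strictly inside an ancestor's
    span, i.e. with material on both their left and their right."""
    ann, _ = spans(tree)

    def walk(node, lo, hi):
        if len(node) == 3:                           # leaf: not counted (simplification)
            return 0
        s, e, left, right = node
        best = max(walk(left, lo, hi), walk(right, lo, hi))           # skip this node
        if lo < s and e < hi:                        # strictly inside: one more nesting
            best = max(best, 1 + max(walk(left, s, e), walk(right, s, e)))
        return best

    return walk(ann, ann[0], ann[1])

def within_embedding_bound(tree, bound=2):
    """Keep a candidate parse only if its center-embedding stays small,
    mirroring the paper's restriction of the search space."""
    return center_embedding_degree(tree) <= bound

# A center-embedded relative clause vs. a right-branching paraphrase:
embedded = (("the", ("dog", (("the", "cat"), "chased"))), "barked")
right_branching = ("the", ("dog", ("barked", ("after", ("the", ("cat", "fled"))))))
print(center_embedding_degree(embedded))          # 2 under this binarization
print(center_embedding_degree(right_branching))   # 0
print(within_embedding_bound(embedded, bound=1))  # False -> pruned
```

Purely left-branching and right-branching trees score 0 under this measure, while nested center-embeddings raise it, which is the qualitative behavior the paper exploits when it bounds stack depth during induction.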

Cited by 35 publications (46 citation statements)
References 17 publications

“…Grammar acquisition models (Noji and Johnson, 2016; Shain et al., 2016) then restrict this memory to some low bound: e.g. two derivation fragments.…”
Section: Like (mentioning)
Confidence: 99%

“…For the grammar induction system, we try the implementation of DMV with stop-probability estimation by Mareček and Straka (2013), which is a common baseline for grammar induction (Le and Zuidema, 2015) because it is language-independent, reasonably accurate, fast, and convenient to use. We also try the grammar induction system of Naseem et al. (2010), which is the state-of-the-art system on UD (Noji et al., 2016). Naseem et al. (2010)'s method, like ours, has prior knowledge of what typical human languages look like.…”
Section: Comparison With Grammar Induction (mentioning)
Confidence: 99%

“…As will become clear in the Experiments section, the basic model discussed previously performs poorly when used for unsupervised parsing, barely outperforming a left-branching baseline for English. We hypothesize the reason is that the basic model is fairly unconstrained: without any constraints to regularize the latent space, the induced parses will be arbitrary, since the model is only trained to maximize sentence likelihood (Naseem et al., 2010; Noji, Miyao, and Johnson, 2016). We therefore introduce posterior regularization (PR; Ganchev et al. 2010) to encourage the neural network to generate well-formed trees.…”
Section: Training Objective (mentioning)
Confidence: 99%
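
For reference, the posterior regularization framework of Ganchev et al. (2010) cited in the excerpt above optimizes a penalized likelihood rather than the raw log-likelihood. In generic notation (a standard statement of the framework, not the citing paper's exact constraint set):

J(\theta) = \log p_\theta(\mathbf{x}) - \min_{q \in \mathcal{Q}} \mathrm{KL}\left( q(\mathbf{z}) \,\|\, p_\theta(\mathbf{z} \mid \mathbf{x}) \right), \qquad \mathcal{Q} = \{\, q : \mathbb{E}_q[\boldsymbol{\phi}(\mathbf{x}, \mathbf{z})] \le \mathbf{b} \,\},

where \boldsymbol{\phi} collects constraint features over the latent structures (here, features encouraging well-formed trees) and \mathbf{b} bounds their expected values under the approximate posterior q.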