Abstract: Syntactic parsing requires a fine balance between expressivity and complexity, so that naturally occurring structures can be accurately parsed without compromising efficiency. In dependency-based parsing, several constraints have been proposed that restrict the class of permissible structures, such as projectivity, planarity, multi-planarity, well-nestedness, gap degree, and edge degree. While projectivity is generally taken to be too restrictive for natural language syntax, it is not clear which of the other …
“…However, while some classes of dependency structures tolerating certain crossings have a very good empirical coverage [31, 42, 43, 44], these proposals still face counterexamples that fall outside the restrictions [45, 46, 47].…”
The structure of a sentence can be represented as a network where vertices are words and edges indicate syntactic dependencies. Interestingly, crossing syntactic dependencies have been observed to be infrequent in human languages. This leads to the question of whether the scarcity of crossings in languages arises from an independent and specific constraint on crossings. We provide statistical evidence suggesting that this is not the case, as the proportion of dependency crossings of sentences from a wide range of languages can be accurately estimated by a simple predictor based on a null hypothesis on the local probability that two dependencies cross given their lengths. The relative error of this predictor never exceeds 5% on average, whereas the error of a baseline predictor assuming a random ordering of the words of a sentence is at least 6 times greater. Our results suggest that the low frequency of crossings in natural languages originates neither from hidden knowledge of language nor from the undesirability of crossings per se, but arises as a mere side effect of the principle of dependency length minimization.
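To make the notion of a crossing concrete, the following is a minimal sketch of how crossings in a single sentence could be counted. It is an illustration only and not the predictor described in the abstract. The names `edges_cross` and `count_crossings` are hypothetical, and the sketch assumes each dependency is given as a pair of 1-based word positions.

```python
from itertools import combinations


def edges_cross(e1, e2):
    """Return True if two dependencies cross when drawn above the sentence.

    Each edge is a pair of word positions (assumed representation).
    """
    a, b = sorted(e1)
    c, d = sorted(e2)
    # Two arcs cross when exactly one endpoint of one arc lies strictly
    # between the endpoints of the other; shared endpoints do not count.
    return a < c < b < d or c < a < d < b


def count_crossings(edges):
    """Count the pairs of dependencies that cross in one sentence."""
    return sum(edges_cross(e1, e2) for e1, e2 in combinations(edges, 2))
```

For example, the edges `(1, 3)` and `(2, 4)` cross, whereas the nested edges `(1, 4)` and `(2, 3)` do not.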
“…[9] define an interval as the set [i, j] := {k ∈ V | i ≤ k and k ≤ j}, where i and j are endpoints, and V is a set of nodes as defined in Section 3.1. Non-projective structures violate this constraint.…”
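The interval definition quoted above can be sketched in code. Because the snippet is truncated, the constraint it refers to is not visible here; the sketch assumes the common formulation in which a structure is projective when the set of descendants of every node (including the node itself) forms an interval. The function names and the `heads` dictionary (mapping each node to its head, with `0` for the root) are assumptions made for illustration.

```python
def interval(i, j, nodes):
    """The set [i, j] := {k in V | i <= k and k <= j}."""
    return {k for k in nodes if i <= k <= j}


def is_interval(subset, nodes):
    """Check whether a set of nodes covers a contiguous span of V."""
    if not subset:
        return True
    return set(subset) == interval(min(subset), max(subset), nodes)


def is_projective(heads):
    """Assumed constraint: every node's yield must form an interval.

    heads maps each node to its head; the root maps to 0 (hypothetical
    convention chosen for this sketch).
    """
    nodes = set(heads)
    children = {n: [] for n in nodes}
    for dep, head in heads.items():
        if head in children:
            children[head].append(dep)

    def yield_of(node):
        # Collect the node and all of its descendants.
        seen = {node}
        stack = [node]
        while stack:
            current = stack.pop()
            for child in children[current]:
                if child not in seen:
                    seen.add(child)
                    stack.append(child)
        return seen

    return all(is_interval(yield_of(n), nodes) for n in nodes)
```

Under this assumption, `{1: 2, 2: 0, 3: 2}` is projective, while `{1: 3, 2: 0, 3: 2}` is not, because the yield of node 3 is `{1, 3}` and skips node 2.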
Section: Minimalist Grammars and Block Degree
Abstract. This paper provides an interpretation of Minimalist Grammars [16,17] in terms of dependency structures. Under this interpretation, merge operations derive projective dependency structures, and movement operations introduce both non-projectivity and illnestedness. This new characterization of the generative capacity of Minimalist Grammar makes it possible to discuss the linguistic relevance of non-projectivity and illnestedness. This in turn provides insight into grammars that derive structures with these properties.
“…Finally, we make a projectivity assumption, which is supported by empirical data in many languages (Kuhlmann and Nivre, 2006; Havelka, 2007), and makes a model computationally less expensive. A dependency parse D of a sentence W = w_1, …”
Section: Direct Correspondence Assumption and Syntactic Cohesion in SMT
This paper describes a novel target-side syntactic language model for phrase-based statistical machine translation: the bilingual structured language model. Our approach represents a new way to adapt structured language models (Chelba and Jelinek, 2000) to statistical machine translation, and a first attempt to adapt them to phrase-based statistical machine translation. We propose a number of variations of the bilingual structured language model and evaluate them in a series of rescoring experiments. Rescoring of 1000-best translation lists produces statistically significant improvements of up to 0.7 BLEU over a strong baseline for Chinese-English, but does not yield improvements for Arabic-English.