Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d15-1158

Semi-supervised Dependency Parsing using Bilexical Contextual Features from Auto-Parsed Data

Abstract: We present a semi-supervised approach to improve dependency parsing accuracy by using bilexical statistics derived from auto-parsed data. The method is based on estimating the attachment potential of head-modifier words, by taking into account not only the head and modifier words themselves, but also the words surrounding the head and the modifier. When integrating the learned statistics as features in a graph-based parsing model, we observe nice improvements in accuracy when parsing various English datasets.
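
To make the idea concrete, the following is a minimal, hypothetical sketch of the kind of pipeline the abstract describes: collect (head, modifier) counts from auto-parsed data, turn them into an association score (PMI is used here as a stand-in for the paper's attachment statistics), and discretize the score into buckets that a graph-based parser could consume as arc features. Function names, thresholds, and the feature template are illustrative assumptions, not the paper's exact recipe.

```python
# Hypothetical sketch: PMI-style bilexical association scores from
# auto-parsed arcs, bucketed into arc features for a graph-based parser.
import math
from collections import Counter

def pmi_table(arcs):
    """arcs: list of (head_word, modifier_word) pairs from auto-parsed data."""
    pair_counts = Counter(arcs)
    head_counts = Counter(h for h, _ in arcs)
    mod_counts = Counter(m for _, m in arcs)
    total = len(arcs)
    table = {}
    for (h, m), c in pair_counts.items():
        p_pair = c / total
        p_head = head_counts[h] / total
        p_mod = mod_counts[m] / total
        table[(h, m)] = math.log(p_pair / (p_head * p_mod))
    return table

def bucket(score, edges=(-2.0, 0.0, 2.0, 4.0)):
    """Discretize an association score into a few buckets (assumed thresholds)."""
    for i, edge in enumerate(edges):
        if score < edge:
            return i
    return len(edges)

def arc_features(head, mod, table):
    """Feature strings for a candidate head -> modifier arc."""
    if (head, mod) in table:
        return ["PMI_BUCKET=%d" % bucket(table[(head, mod)])]
    return ["PMI_UNSEEN"]

# Toy usage on a handful of auto-parsed arcs.
arcs = [("ate", "pizza"), ("ate", "pizza"), ("ate", "quickly"),
        ("saw", "pizza"), ("saw", "movie"), ("saw", "movie")]
table = pmi_table(arcs)
print(arc_features("ate", "pizza", table))   # ['PMI_BUCKET=2']
print(arc_features("ate", "movie", table))   # ['PMI_UNSEEN']
```

The contextual part of the method could be approximated by keying the counts on tuples that also include the words adjacent to the head and the modifier, at the cost of sparser counts.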

Cited by 9 publications (12 citation statements)
References 16 publications

“…In parsing, bilexical preferences have been used by Van Noord (2007) to improve syntactic ambiguity resolution in a Maximum-Entropy parser for Dutch. Kiperwasser and Goldberg (2015) extended bilexical preferences to contextual association scores based on PMI and dependency embeddings (Levy and Goldberg, 2014a) in a graph-based parser. Mirroshandel and Nasr (2016) integrated selectional preferences into a graph-based dependency parser.…”
Section: Relation
confidence: 99%
“…In these approaches, word co-occurrences are defined in terms of dependency contexts (x is the governor of word w), instead of linear contexts (x appears within a range of s around word w). Embedding techniques have also started to be applied to objects other than words, namely on dependency relations (Bansal, 2015; Kiperwasser and Goldberg, 2015).…”
Section: Word Vectors For Dependency Parsing
confidence: 99%
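
The contrast between linear and dependency contexts in the statement above can be illustrated with a small sketch (hypothetical helper functions, not the cited papers' code): linear contexts pair each word with its neighbors within a window of s tokens, while dependency contexts pair each word with its governor from a parse tree.

```python
def linear_contexts(words, s=2):
    """(word, context) pairs from a symmetric window of s tokens."""
    pairs = []
    for i, w in enumerate(words):
        lo, hi = max(0, i - s), min(len(words), i + s + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((w, words[j]))
    return pairs

def dependency_contexts(words, heads, labels):
    """(word, context) pairs where the context is the word's governor,
    tagged with the dependency relation; heads are 1-based, 0 = root."""
    pairs = []
    for i, w in enumerate(words):
        if heads[i] > 0:
            gov = words[heads[i] - 1]
            pairs.append((w, "%s/%s" % (gov, labels[i])))
    return pairs

# "the cat sat": cat <-det- the, sat <-nsubj- cat, sat = root
words  = ["the", "cat", "sat"]
heads  = [2, 3, 0]
labels = ["det", "nsubj", "root"]
print(linear_contexts(words, s=1))
print(dependency_contexts(words, heads, labels))
```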
“…A line of work is devoted to parsing with RNN models, including using RNNs (Miikkulainen, 1996; Mayberry and Miikkulainen, 1999; Legrand and Collobert, 2015; Watanabe and Sumita, 2015) and LSTM (Hochreiter and Schmidhuber, 1997) RNNs (Kiperwasser and Goldberg, 2016). Legrand and Collobert (2015) used RNNs to learn conditional distributions over syntactic rules; explored sequence-to-sequence learning (Sutskever et al., 2014) for parsing; utilized character-level representations; and Kiperwasser and Goldberg (2016) built an easy-first dependency parser using tree-structured compositional LSTMs. However, all these parsers use greedy search and are trained using the maximum likelihood criterion (except Kiperwasser and Goldberg (2016), who used a margin-based objective).…”
Section: Related Work
confidence: 99%
“…Legrand and Collobert (2015) used RNNs to learn conditional distributions over syntactic rules; explored sequence-to-sequence learning (Sutskever et al., 2014) for parsing; utilized character-level representations; and Kiperwasser and Goldberg (2016) built an easy-first dependency parser using tree-structured compositional LSTMs. However, all these parsers use greedy search and are trained using the maximum likelihood criterion (except Kiperwasser and Goldberg (2016), who used a margin-based objective). For learning global models, Watanabe and Sumita (2015) used a margin-based objective, which was not optimized for the evaluation metric; although not using RNNs, Weiss et al. (2015) proposed a method using the averaged perceptron with beam search (Collins, 2002; Collins and Roark, 2004; Zhang and Clark, 2008), which required fixing the neural network representations, and thus their model parameters were not learned using end-to-end backpropagation.…”
Section: Related Work
confidence: 99%