Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, Main Conference (HLT-NAACL 2006)
DOI: 10.3115/1220835.1220855
Effective self-training for parsing

Abstract: We present a simple, but surprisingly effective, method of self-training a two-phase parser-reranker system using readily available unlabeled data. We show that this type of bootstrapping is possible for parsing when the bootstrapped parses are processed by a discriminative reranker. Our improved model achieves an f-score of 92.1%, an absolute 1.1% improvement (12% error reduction) over the previous best result for Wall Street Journal parsing. Finally, we provide some analysis to better understand the phenomenon.
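The abstract describes a two-stage pipeline: a generative first-stage parser produces n-best parses, a discriminative reranker selects among them, and the reranker's output on unlabeled text is folded back into the training data. The following Python sketch illustrates that loop under assumed interfaces; the parser and reranker objects and their methods (train, parse_nbest, rerank) are hypothetical placeholders, not the authors' implementation (which used the Charniak parser with the Charniak-Johnson reranker).

def self_train(labeled_trees, unlabeled_sents, parser, reranker, n_best=50):
    # Step 1: train the first-stage parser on the labeled treebank (WSJ).
    parser.train(labeled_trees)

    # Step 2: parse the unlabeled sentences and let the discriminative
    # reranker choose the best parse from each n-best list.
    self_labeled = []
    for sent in unlabeled_sents:
        candidates = parser.parse_nbest(sent, n=n_best)
        self_labeled.append(reranker.rerank(candidates))

    # Step 3: retrain the first-stage parser on the union of gold and
    # self-labeled parses. The paper's key finding is that this helps
    # when the added parses come from the reranker, whereas naive
    # self-training on the parser's own 1-best output does not.
    parser.train(labeled_trees + self_labeled)
    return parser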

Cited by 397 publications (339 citation statements)
References 16 publications (24 reference statements)
“…The difficulty of providing sufficient supervision has motivated work on semi-supervised and unsupervised learning for many of these tasks (McClosky et al., 2006; Spitkovsky et al., 2010; Subramanya et al., 2010; Stratos and Collins, 2015; Marinho et al., 2016; Tran et al., 2016), including several that also used autoencoders (Ammar et al., 2014; Lin et al., 2015; Miao and Blunsom, 2016; Kociský et al., 2016; Cheng et al., 2017). In this paper we expand on these works and suggest a neural CRF autoencoder that can leverage both labeled and unlabeled data.…”
Section: Related Work (citation type: mentioning)
confidence: 99%
“…Although naively adding self-labeled material to extend training data is normally not successful, there have been successful variants of self-learning for parsing as well. For instance, in [16] self-learning is used to improve a two-phase parser-reranker, with very good results for the classical Wall Street Journal parsing task.…”
Section: Previous Research (citation type: mentioning)
confidence: 99%
“…McClosky et al. (2006a) introduce self-training techniques for two-step parsers. In McClosky et al. (2006b), these methods are then used to adapt a parser trained on Wall Street Journal data to a new domain, without using labeled data from that domain.…”
Section: Translation Examples (citation type: mentioning)
confidence: 99%