Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, Main Conference (HLT-NAACL 2006)
DOI: 10.3115/1220835.1220876

Prototype-driven learning for sequence models

Abstract: We investigate prototype-driven learning for primarily unsupervised sequence modeling. Prior knowledge is specified declaratively, by providing a few canonical examples of each target annotation label. This sparse prototype information is then propagated across a corpus using distributional similarity features in a log-linear generative model. On part-of-speech induction in English and Chinese, as well as an information extraction task, prototype features provide substantial error rate reductions over competit…
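The abstract's core mechanism, linking non-prototype words to prototypes via distributional similarity and adding the matched prototypes as features, can be illustrated with a small sketch. This is not the authors' implementation: the context-count construction below is a simplified stand-in for the paper's distributional-similarity computation, and the function names, example prototype lists, and the 0.35 cosine threshold are illustrative assumptions.

```python
import numpy as np
from collections import defaultdict


def context_vectors(sentences):
    """Count the immediate left/right neighbours of each word type."""
    vocab = sorted({w for s in sentences for w in s})
    index = {w: i for i, w in enumerate(vocab)}
    dim = 2 * len(vocab)
    counts = defaultdict(lambda: np.zeros(dim))
    for s in sentences:
        for i, w in enumerate(s):
            if i > 0:                        # word to the left
                counts[w][index[s[i - 1]]] += 1
            if i + 1 < len(s):               # word to the right
                counts[w][len(vocab) + index[s[i + 1]]] += 1
    return counts


def prototype_features(word, counts, prototypes, threshold=0.35):
    """Return PROTO=<p> features for every prototype word whose context vector
    is cosine-similar to that of `word`.  `prototypes` maps each label to a few
    canonical example words; the threshold value is an illustrative choice."""
    feats = []
    v = counts[word]
    for protos in prototypes.values():
        for p in protos:
            u = counts[p]
            denom = np.linalg.norm(u) * np.linalg.norm(v)
            if denom > 0 and float(np.dot(u, v)) / denom >= threshold:
                feats.append(f"PROTO={p}")
    return feats
```

For example, with prototypes such as {"NN": ["president", "year"], "DT": ["the", "a"]}, a word whose left/right context counts resemble those of "president" would receive the feature PROTO=president, which the log-linear sequence model can then weight like any other feature.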

Cited by 80 publications (104 citation statements) | References 11 publications

“…Despite much previous work (Smith and Eisner, 2005; Johnson, 2007; Toutanova and Johnson, 2007; Haghighi and Klein, 2006; Berg-Kirkpatrick et al., 2010), results on this task are complicated by varying assumptions and unclear evaluation metrics (Christodoulopoulos et al., 2010). Perhaps most importantly, they are not good enough to be practical.…”
Section: Introduction (mentioning; confidence 99%)
“…Perhaps most importantly, they are not good enough to be practical. Even with indirect supervision, for example the prototype-driven method of Haghighi and Klein (2006), which assumes a set of word examples for each tag type, the best per-position accuracy remains in the mid-70% range.…”
Section: Introduction (mentioning; confidence 99%)
“…Similarly, Wu and Srihari [34] assigned labels to unlabeled documents with 'labeled features' and then used these pseudo-examples in conjunction with labeled examples to train a weighted-margin Support Vector Machine with regularization. Later, Haghighi and Klein [36] explored similar "labeled features" in a "pseudo-example" strategy for training a generative MRF sequence model.…”
Section: Semi-supervised Learning with Labeled Features (mentioning; confidence 99%)
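The "labeled features → pseudo-examples" pattern described in the excerpt above can be sketched in a few lines. This is not the exact procedure of Wu and Srihari or of Haghighi and Klein, only the general idea: the token-list document representation, the majority-vote rule, and any example feature-to-label map are assumptions for illustration.

```python
def pseudo_examples(unlabeled_docs, labeled_features):
    """Create pseudo-labeled examples from unlabeled documents using a small
    map from indicative features (here, tokens) to class labels, e.g. a
    hypothetical map {"touchdown": "sports", "senate": "politics"}."""
    examples = []
    for doc in unlabeled_docs:                      # each doc is a list of tokens
        hits = [labeled_features[tok] for tok in doc if tok in labeled_features]
        if hits:
            # Use the majority label among matched features as the pseudo-label.
            label = max(set(hits), key=hits.count)
            examples.append((doc, label))
    return examples
```

The resulting (document, pseudo-label) pairs can then be mixed with genuinely labeled examples when training a downstream classifier or sequence model.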
“…To address this, we propose a novel voting scheme that is inspired by the widely used 1-to-1 accuracy metric for POS induction (Haghighi and Klein, 2006). This metric maps system tags to gold tags so as to maximize accuracy, under the constraint that each gold tag is mapped to by at most one system tag.…”
Section: System Combination (mentioning; confidence 99%)
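The 1-to-1 accuracy metric mentioned in the excerpt is a maximum-weight bipartite matching between induced tags and gold tags. A minimal sketch, assuming SciPy is available for the assignment solver (the function name and interface are illustrative, not taken from the cited papers):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def one_to_one_accuracy(system_tags, gold_tags):
    """Accuracy under the best 1-to-1 mapping of induced tags to gold tags:
    each gold tag receives at most one induced tag, and vice versa."""
    sys_ids = {t: i for i, t in enumerate(sorted(set(system_tags)))}
    gold_ids = {t: i for i, t in enumerate(sorted(set(gold_tags)))}
    counts = np.zeros((len(sys_ids), len(gold_ids)), dtype=int)
    for s, g in zip(system_tags, gold_tags):
        counts[sys_ids[s], gold_ids[g]] += 1
    # Maximize matched tokens by solving the assignment problem (Hungarian algorithm).
    rows, cols = linear_sum_assignment(-counts)
    return counts[rows, cols].sum() / len(gold_tags)
```

For instance, one_to_one_accuracy(["3", "3", "7"], ["DT", "DT", "NN"]) returns 1.0, since the induced tags can be mapped 3→DT and 7→NN without reusing a gold tag.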
“…For this reason, EM is typically only used to train log-linear model weights when Z(θ) = 1, e.g., for hidden Markov models, probabilistic context-free grammars, and models composed of locally-normalized log-linear models (Berg-Kirkpatrick et al., 2010), among others. There have been efforts at approximating the summation over elements of X, whether by limiting sequence length (Haghighi and Klein, 2006), only summing over observations in the training data (Riezler, 1999), restricting the observation space based on the task, or using Gibbs sampling to obtain an unbiased sample of the full space (Della Pietra et al., 1997; Rosenfeld, 1997).…”
Section: EM and Contrastive Estimation (mentioning; confidence 99%)
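The excerpt above concerns log-linear models whose partition function sums over an intractably large observation space X, and lists ways of restricting that sum. Below is a minimal sketch of the restricted-normalization idea in the spirit of contrastive estimation (Smith and Eisner, 2005), with latent structure omitted and with the bigram features and transposition neighborhood chosen purely for illustration, not taken from the models discussed.

```python
import math


def score(seq, weights):
    """Unnormalized log-linear score: sum of weights on adjacent-word-pair features."""
    return sum(weights.get((a, b), 0.0) for a, b in zip(seq, seq[1:]))


def transposition_neighborhood(seq):
    """The sequence itself plus every sequence obtained by swapping one adjacent
    pair of words (one common contrastive-estimation neighborhood)."""
    neigh = [list(seq)]
    for i in range(len(seq) - 1):
        alt = list(seq)
        alt[i], alt[i + 1] = alt[i + 1], alt[i]
        neigh.append(alt)
    return neigh


def neighborhood_log_prob(seq, weights):
    """log p(seq), with the partition function summed over a small neighborhood
    of seq instead of over all possible observation sequences."""
    neigh = transposition_neighborhood(seq)
    log_z = math.log(sum(math.exp(score(n, weights)) for n in neigh))
    return score(seq, weights) - log_z
```

Normalizing over the transposition neighborhood rather than all of X keeps the denominator to a handful of terms per training sequence, which is what makes gradient-based training of such restricted models tractable.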