2001
DOI: 10.1162/089120101300346804

Bootstrapping Morphological Analyzers by Combining Human Elicitation and Machine Learning

Abstract: This paper presents a semiautomatic technique for developing broad-coverage finite-state morphological analyzers for use in natural language processing applications. It consists of three components: elicitation of linguistic information from humans, a machine learning bootstrapping scheme, and a testing environment. The three components are applied iteratively until a threshold of output quality is attained. The initial application of this technique is for the morphology of low-density languages in the context …

Citations: cited by 29 publications (24 citation statements)
References: 20 publications
“…a Penn TreeBank POS tag). What they don't yield is a generative model of the language's morphology which would contain information about the position or inflectional classes. Work that is most similar to mine in what it aims for is Oflazer and Gokhan (1996) and Oflazer, Nirenburg and McShane (2001). Oflazer and Gokhan (1996) use constraints to model morphotactics, but the constraints are hand-built and unsupervised learning is used only for segmentation.…”
mentioning
confidence: 99%
“…Active learning methods have been applied for constructing FST-based analyzers by eliciting new rules from a user with linguistic expertise (Oflazer et al, 2001; Bosch et al, 2008). These development efforts are fast for rule-based systems, but still require months of work.…”
Section: Active Learning Applied To Morphological Segmentation
mentioning
confidence: 99%
“…The learning occurs by approximating the contexts both from specific examples (which may be too narrow) and generalisations (which may be too broad) as context conditions. Oflazer and Nirenburg (1999) and Oflazer et al (2001) present a method for bootstrapping morphological analysers by combining human elicitation and machine learning. Human informants provide the examples used by the machine learning process to deduce rewrite rules necessary for accounting for the data.…”
Section: Past Work
mentioning
confidence: 99%