2007
DOI: 10.1007/978-3-540-70939-8_10

Morphological Disambiguation of Turkish Text with Perceptron Algorithm

Abstract: This paper describes the application of the perceptron algorithm to the morphological disambiguation of Turkish text. Turkish has a productive derivational morphology. Due to the ambiguity caused by complex morphology, a word may have multiple morphological parses, each with a different stem or sequence of morphemes. The methodology employed is based on ranking with perceptron algorithm which has been successful in some NLP tasks in English. We use a baseline statistical trigram-based model of a prev…
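To make the ranking idea in the abstract concrete, here is a minimal sketch of perceptron-based reranking over a list of candidate parse sequences. It is an illustration only, not the paper's implementation: the feature map (unigram and bigram counts over parse labels), the function names `features`, `score`, `rerank`, and `train`, and the label strings used with it are all assumptions introduced for this example. In the setting the abstract describes, the candidate list would come from a baseline model; here it is simply passed in.

```python
from collections import defaultdict


def features(parse_seq):
    """Hypothetical feature map: unigram and bigram counts over parse labels."""
    feats = defaultdict(float)
    prev = "<s>"  # sentence-start marker (an assumption of this sketch)
    for label in parse_seq:
        feats[("uni", label)] += 1.0
        feats[("bi", prev, label)] += 1.0
        prev = label
    return feats


def score(weights, feats):
    """Linear score: dot product of the weight vector and the feature counts."""
    return sum(weights.get(f, 0.0) * v for f, v in feats.items())


def rerank(weights, candidates):
    """Return the highest-scoring candidate (the first one wins on ties)."""
    return max(candidates, key=lambda c: score(weights, features(c)))


def train(data, epochs=5):
    """Plain (unaveraged) perceptron training for reranking.

    data: list of (candidates, gold) pairs, where candidates is a list of
    label tuples and gold is the correct tuple among them.
    """
    weights = defaultdict(float)
    for _ in range(epochs):
        for candidates, gold in data:
            pred = rerank(weights, candidates)
            if pred != gold:
                # Promote features of the gold parse, demote the prediction's.
                for f, v in features(gold).items():
                    weights[f] += v
                for f, v in features(pred).items():
                    weights[f] -= v
    return weights
```

Each training example pairs a candidate list with its correct sequence, for instance `([("N", "V"), ("N", "N")], ("N", "N"))` with placeholder labels. The update fires only when the top-ranked candidate differs from the gold one, which is what makes this a mistake-driven ranker. Averaging the weights over updates is a common refinement that is omitted here for brevity.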

Cited by 54 publications (47 citation statements)
References: 13 publications
“…3 We used 10 different classification algorithms in the WEKA toolkit. The results of the classification algorithms and the previous approaches are given in Table 3.…”
Section: Results (mentioning)
confidence: 99%
“…The Perceptron Algorithm [3] is a combination of statistical and machine learning approaches. They use the Baseline Trigram-Based Model to generate n-best parses for each sentence.…”
Section: Morphological Disambiguation (mentioning)
confidence: 99%
“…Finally, we generate the normalized word forms from the now disambiguated sequence of morphemes. Our initial results are comparable to morphological disambiguation on Turkish texts, despite the fact that we have a much smaller training corpus (∼ 2800 sentences, compared to over 50,000 (Görgün and Yildiz, 2011) and 45,000 sentences (Sak et al, 2007)). A possible explanation is that Turkish morphology is more complex: Turkish has more productive suffixes than Quechua, and there are relatively complex morpho-phonological rules that determine word formation, such as two dimensional vowel harmony and context-sensitive realizations of consonants (Oflazer, 1994).…”
Section: Discussion (mentioning)
confidence: 55%
“…There are also several constraint-based methods for disambiguation [18,19]. Another method employs a perceptron algorithm for morphological disambiguation [20]. We use the tool produced by this study as a morphological parser ranging from preparing the corpus to the online question generation.…”
Section: Related Work (mentioning)
confidence: 99%