Proceedings of the 36th Annual Meeting on Association for Computational Linguistics - 1998
DOI: 10.3115/980691.980692

A freely available morphological analyzer, disambiguator and context sensitive lemmatizer for German

Abstract: In this paper we present Morphy, an integrated tool for German morphology, part-of-speech tagging and context-sensitive lemmatization. Its large lexicon of more than 320,000 word forms plus its ability to process German compound nouns guarantee a wide morphological coverage. Syntactic ambiguities can be resolved with a standard statistical part-of-speech tagger. By using the output of the tagger, the lemmatizer can determine the correct root even for ambiguous word forms. The complete package is freely availabl…
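
The idea of letting the tagger's output select the lemma can be illustrated with a minimal sketch. The lexicon entries, the tag names (`VERB`, `DET`, `NOUN`) and the function `lemmatize` are assumptions made for illustration only; they are not Morphy's lexicon, tagset or interface.

```python
# Toy lexicon: lowercased word form -> {POS tag -> lemma}.
# Entries and tag names are illustrative assumptions, not Morphy's data.
LEXICON = {
    "meinen": {"VERB": "meinen", "DET": "mein"},
    "spielen": {"VERB": "spielen", "NOUN": "Spiel"},
    "kinder": {"NOUN": "Kind"},
}


def lemmatize(tagged_tokens):
    """Return one lemma per (form, tag) pair.

    The tag chosen by a tagger selects among the readings of an
    ambiguous form. Unambiguous forms use their only reading, and
    unknown forms are returned unchanged.
    """
    lemmas = []
    for form, tag in tagged_tokens:
        readings = LEXICON.get(form.lower(), {})
        if tag in readings:
            lemmas.append(readings[tag])
        elif len(readings) == 1:
            lemmas.append(next(iter(readings.values())))
        else:
            lemmas.append(form)
    return lemmas
```

With this toy lexicon, the form "meinen" receives the lemma "mein" when tagged as a determiner and "meinen" when tagged as a verb, which is the kind of ambiguity that a lemmatizer without context cannot resolve.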

Cited by 27 publications (18 citation statements)
References 6 publications (7 reference statements)
“…(According to Lezius, Rapp, & Wettler, 1998, 93% of the tokens of a German text had only one lemma.) Although we had a context-sensitive lemmatizer for German available (Lezius, Rapp, & Wettler, 1998), this was not the case for English, so for reasons of symmetry we decided not to use the context feature. In cases in which an ambiguous word can be both a content and a function word (e.g., can), preference was given to those interpretations that appeared to occur more frequently.…”
Section: Pre-processing
Citation type: mentioning
confidence: 99%
“…For example, the accusative and plural markers must occur in a specific order after a word ending morpheme, and a word cannot end with a root or affix. Maximal morpheme matching has been incorporated into morphological analysis systems for other agglutinative languages, including German (for compound nouns only) (Lezius et al, 1998) and Turkish (Solak and Oflazer, 1992). Thus, it is valuable to directly compare McBurnett's approach to other approaches of Esperanto morphological segmentation.…”
Section: Esperanto
Citation type: mentioning
confidence: 99%
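
The maximal morpheme matching mentioned in the statement above can be sketched as a longest-prefix-first segmentation with backtracking. This is a generic illustration of the technique, not Morphy's actual compound analysis; the vocabulary `VOCAB` and the function `split_compound` are hypothetical, and German linking elements (such as the "s" in "Arbeitszimmer") are deliberately ignored.

```python
# Toy vocabulary of known (lowercased) stems; an illustrative assumption.
VOCAB = {"haus", "tür", "haustür", "schlüssel"}


def split_compound(word, vocab=VOCAB):
    """Segment `word` into vocabulary entries, longest prefix first.

    Returns a list of parts, or None if no complete segmentation exists.
    If the longest matching prefix leads to a dead end, the next shorter
    matching prefix is tried (backtracking).
    """
    word = word.lower()
    if not word:
        return []
    for end in range(len(word), 0, -1):
        prefix = word[:end]
        if prefix in vocab:
            rest = split_compound(word[end:], vocab)
            if rest is not None:
                return [prefix] + rest
    return None
```

Because longer prefixes are tried first, "Haustürschlüssel" is split into "haustür" and "schlüssel" rather than "haus", "tür" and "schlüssel". A word with no complete segmentation in the vocabulary yields `None`.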
“…investigates how off-the-shelf POS taggers can be combined to better cope with text material that differs from the type of text the taggers were originally trained, and for which there are no readily available training corpora. The author uses three taggers for German: TreeTagger, Morphy (Lezius et al 1998) and QTAG. He evaluates the taggers and creates a list of differences between taggers and a hypothesis about which parameters were likely to influence tagger performance.…”
Section: Combining Pos-taggers
Citation type: mentioning
confidence: 99%