Proceedings of the 19th International Workshop on Treebanks and Linguistic Theories 2020
DOI: 10.18653/v1/2020.tlt-1.1

Clause-Level Tense, Mood, Voice and Modality Tagging for German

Abstract: We present a language-independent clausizer (clause splitter) based on Universal Dependencies (Nivre et al., 2016), and a clause-level tagger for grammatical tense, mood, voice and modality in German. The paper recapitulates verbal inflection in German, always juxtaposed with its close relative English, and transforms the linguistic theory into a rule-based algorithm. We achieve state-of-the-art accuracies of 92.6% for tense, 79.0% for mood, 93.8% for voice and 79.8% for modality in the literary domain. Our impl…
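
To illustrate the general idea of splitting clauses on a Universal Dependencies tree, here is a minimal sketch. It is not the algorithm from the paper: the set of clause-introducing relations, the function name `split_clauses`, and the token tuple layout are all assumptions made for this example, and coordination (`conj`) is deliberately left unhandled.

```python
from collections import defaultdict

# Assumption: tokens attached by one of these UD relations start a new clause.
# The paper's actual rule set may differ; `conj` is ignored here for brevity.
CLAUSAL_RELATIONS = {
    "root", "ccomp", "xcomp", "advcl", "acl", "acl:relcl", "csubj", "parataxis",
}


def split_clauses(tokens):
    """Group tokens into clauses.

    tokens: list of (id, form, head, deprel) tuples with 1-based ids,
            where head == 0 marks the root. The tree is assumed well-formed.
    Returns a list of clauses ordered by the id of their head token; each
    clause is a list of word forms in surface order.
    """
    by_id = {tid: (head, deprel) for tid, _form, head, deprel in tokens}

    def clause_head(tid):
        # Climb towards the root until a token with a clausal relation is found.
        while True:
            head, deprel = by_id[tid]
            if deprel in CLAUSAL_RELATIONS or head == 0:
                return tid
            tid = head

    clauses = defaultdict(list)
    for tid, form, _head, _deprel in tokens:
        clauses[clause_head(tid)].append((tid, form))

    return [
        [form for _tid, form in sorted(members)]
        for _head_id, members in sorted(clauses.items())
    ]


# Hand-written parse of "Er sagt, dass sie kommt." (hypothetical input data).
EXAMPLE = [
    (1, "Er", 2, "nsubj"),
    (2, "sagt", 0, "root"),
    (3, ",", 6, "punct"),
    (4, "dass", 6, "mark"),
    (5, "sie", 6, "nsubj"),
    (6, "kommt", 2, "ccomp"),
    (7, ".", 2, "punct"),
]

if __name__ == "__main__":
    for clause in split_clauses(EXAMPLE):
        print(" ".join(clause))
```

Each token is assigned to its nearest ancestor that carries a clausal relation, so the quality of the split depends directly on the quality of the dependency parse.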

Cited by 3 publications (5 citation statements)
References 19 publications (27 reference statements)
“…Furthermore, a lot of the datasets in the shared task incorporate automatically created dependency trees (created by models trained on UD treebanks), which may lead to follow-up errors in the clause-splitting step. Dönicke (2020) reports an F1 of 81% for predicting clauses in a German text after preprocessing it with a spaCy model trained on the German UD treebanks. Even though this number only gives a rough estimate on how well our system identifies clauses, there is clearly room for improvement.…”
Section: Results (mentioning)
confidence: 99%
“…Algorithm 1 shows an updated version of the original algorithm that has been modified to work with a broader range of languages, specifically the languages in the shared task. In the following, the algorithm is briefly described, with a focus on the adaptions made for multiple languages (numbers in parentheses refer to lines in the pseudocode); for further explanations see Dönicke (2020).…”
Section: Feature Vectors (mentioning)
confidence: 99%
“…The Python package spaCy offers a solution, but reports question its reliability for German text [23]. In fact, determining the tense in German text is almost a research project in itself [22]. Given the limited scope and resources of this research project, tense as an explanatory aspect is not further considered.…”
Section: Psycholinguistic Word Properties (mentioning)
confidence: 99%
“…Clause segmentation is performed with the clausizer presented in Dönicke (2020). The manually created clause-level annotations are then automatically mapped to the detected clauses.…”
mentioning
confidence: 99%