Katrin Ortmann scite author profile

Katrin Ortmann

4 Publications

8 Citation Statements Received

75 Citation Statements Given

Affiliations

Ruhr University Bochum

Publications

Annotating Orthographic Target Hypotheses in a German L1 Learner Corpus

Laarmann-Quante, Ortmann, Ehlert, et al. 2017

NLP applications for learners often rely on annotated learner corpora. It is therefore important that the annotations are meaningful for the task as well as consistent and reliable. We present a new longitudinal L1 learner corpus for German (handwritten texts collected in grades 2-4), which is transcribed and annotated with a target hypothesis that strictly only corrects orthographic errors, and is thereby tailored to research and tool development for orthographic issues in primary school. While for most corpora, transcription and target hypothesis are not evaluated, we conducted a detailed inter-annotator agreement study for both tasks. Although we achieved high agreement, our discussion of cases of disagreement shows that even with detailed guidelines, annotators differ here and there for different reasons, which should also be considered when working with transcriptions and target hypotheses of other corpora, especially if no explicit guidelines for their construction are known.

The Litkey Corpus: A richly annotated longitudinal corpus of German texts written by primary school children

et al. 2019

Compared to early language development, later changes to the language system during orthography and literacy acquisition have not yet been researched in detail. We present a longitudinal corpus of texts on short picture stories written by German primary school children between grades 2 and 4 and grades 3 and 4. It includes 1,922 texts with 212,505 tokens (6,364 types) from 251 children. For each text, rich metadata is available, including age, grade and linguistic background (at least 60% of the children were multilingual). To our knowledge, our corpus is the largest longitudinal corpus of written texts by children at primary school age. Each word is included in its original spelling as well as in a normalized form (target hypothesis), specifying the intended word form, which we corrected for orthographic but not grammatical errors. Original and target word forms are aligned characterwise and the target word forms are enriched with phonological, syllabic, and morphological information. Additionally, for each target word form, we established key lexical variables, e.g., word frequency or summed bigram frequency, as specified in childLex. Where applicable, we also specify key features of German orthography (e.g., consonant doubling, vowel lengthening). Taken together, this information allows for a detailed assessment of the properties of words that tend to increase the likelihood of spelling errors. The corpus is available in different formats: as tab-delimited annotated token and type based lists, in an XML format, and via the corpus search tool ANNIS.

Variation between Different Discourse Types: Literate vs. Oral

Ortmann, Dipper 2019

This paper deals with the automatic identification of literate and oral discourse in German texts. A range of linguistic features is selected and their role in distinguishing between literate- and oral-oriented registers is investigated, using a decision-tree classifier. It turns out that all of the investigated features are related in some way to oral conceptuality. Especially simple measures of complexity (average sentence and word length) are prominent indicators of oral and literate discourse. In addition, features of reference and deixis (realized by different types of pronouns) also prove to be very useful in determining the degree of orality of different registers.

Comments of the German Insurance Association GDV on the European Commission’s proposals for a revised Product Liability Directive (PLD-P) and an Artificial Intelligence Liability Directive (AILD-P)

Ortmann 2023
