Proceedings of the 10th Linguistic Annotation Workshop Held in Conjunction with ACL 2016 (LAW-X 2016), 2016
DOI: 10.18653/v1/w16-1709

Different Flavors of GUM: Evaluating Genre and Sentence Type Effects on Multilayer Corpus Annotation Quality

Abstract: Genre and domain are well-known covariates of both manual and automatic annotation quality. Comparatively less is known about the effect of sentence types, such as imperatives, questions or fragments, and how they interact with text type effects. Using mixed effects models, we evaluate the relative influence of genre and sentence types on automatic and manual annotation quality for three related tasks in English data: POS tagging, dependency parsing and coreference resolution. For the latter task, we also deve…
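The modeling approach named in the abstract (mixed effects models of annotation quality against genre and sentence type) can be illustrated with a short sketch. The example below is not the authors' setup: it uses simulated per-sentence accuracy scores, hypothetical genre and sentence-type labels, and statsmodels' mixedlm with a random intercept per document, just to show how fixed effects for genre and sentence type are estimated alongside document-level grouping.

```python
# Minimal sketch of a mixed effects analysis like the one described in the
# abstract. All data here are simulated and the column names are hypothetical;
# this illustrates the modeling approach, not the paper's actual setup.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n = 400

# Hypothetical genre and sentence-type labels per sentence
genre = rng.choice(["news", "interview", "how-to", "travel"], size=n)
sent_type = rng.choice(["declarative", "imperative", "question", "fragment"], size=n)
doc_id = rng.integers(0, 20, size=n)  # 20 simulated documents

# Simulated per-sentence tagging accuracy with small genre/sentence-type offsets
accuracy = (0.95
            - 0.04 * (sent_type == "fragment")
            - 0.02 * (genre == "how-to")
            + rng.normal(0.0, 0.03, size=n))

df = pd.DataFrame({"accuracy": accuracy, "genre": genre,
                   "sent_type": sent_type, "doc_id": doc_id})

# Fixed effects for genre and sentence type, random intercept per document
model = smf.mixedlm("accuracy ~ C(genre) + C(sent_type)", df, groups=df["doc_id"])
result = model.fit()
print(result.summary())
```

In this sketch the coefficients on C(genre) and C(sent_type) give the relative influence of each factor on the simulated quality measure, while the group variance absorbs document-level variation; the paper applies mixed effects models of this kind to real POS tagging, dependency parsing and coreference quality measures.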

Cited by 3 publications (4 citation statements)
References: 19 publications
“…We see three directions for future research in this space. This type of quantitative characterization of semantic relations could be extended to other genres [2][3][4]. Alternatively, additional semantic or pragmatic relations could be annotated at both the sentence and document level [9].…”
Section: Discussion (mentioning)
confidence: 99%
“…Multi-layered corpora are corpora that are annotated for multiple, mutually independent layers of natural language information on the same text [1][2][3][4][5]. While the contents of one layer may not be directly and immediately inferred from the contents of another, there may nonetheless be some correlation between elements of one layer and another.…”
Section: Related Research 2.1 Multi-layered Corpora (mentioning)
confidence: 99%
“…One reviewer has asked how StanfordNLP compares to other available libraries, such as Spacy (https://spacy.io/). While we do not have up to date numbers for Spacy, which was not featured in the recent CoNLL shared task on Universal Dependencies parsing, the most recent numbers reported in (Zeldes and Simonson, 2016) do not suggest that it would outperform StanfordNLP.…”
Citation type: mentioning
confidence: 81%
“…Interest in this question re-emerged recently. For example, focusing on annotation difficulty, Zeldes and Simonson (2016) remark "that domain adaptation may be folding in sentence type effects", motivated by earlier findings by Silveira et al. (2014) who remark that "[t]he most striking difference between the two types of data [Web and newswire] has to do with imperatives, which occur two orders of magnitude more often in the EWT [English Web Treebank]." A very recent paper examines word order properties and their impact on parsing taking a control experiment approach (Gulordava and Merlo, 2016).…”
Section: Fortuitous Data (mentioning)
confidence: 99%