2022
DOI: 10.1162/tacl_a_00449
Dealing with Disagreements: Looking Beyond the Majority Vote in Subjective Annotations

Abstract: Majority voting and averaging are common approaches used to resolve annotator disagreements and derive single ground truth labels from multiple annotations. However, annotators may systematically disagree with one another, often reflecting their individual biases and values, especially in the case of subjective tasks such as detecting affect, aggression, and hate speech. Annotator disagreements may capture important nuances in such tasks that are often ignored while aggregating annotations to a single ground t…
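The two aggregation strategies the abstract names, majority voting for categorical labels and averaging for numeric ones, can be sketched in a few lines. This is a minimal illustration with made-up annotator inputs, not code from the paper:

```python
from collections import Counter
from statistics import mean

def majority_vote(labels):
    """Return the most common label among annotators (ties broken arbitrarily)."""
    return Counter(labels).most_common(1)[0][0]

def average_rating(ratings):
    """Collapse numeric annotations (e.g. an affect intensity scale) into their mean."""
    return mean(ratings)

# Three hypothetical annotators disagree on whether a comment is hate speech:
print(majority_vote(["hate", "not_hate", "hate"]))  # -> hate
# The dissenting vote -- possibly a systematic difference in judgment -- is discarded.
print(average_rating([1, 1, 5]))  # -> 2.33... (the outlying rating is averaged away)
```

Both functions return a single "ground truth" value, which is exactly the step the paper argues can erase meaningful disagreement on subjective tasks.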

Cited by 89 publications (74 citation statements)
References 68 publications
“…merge two labels into one or separate one label into two or more) is another direction for future work. There has been recent work in dealing with bias in annotation [2]. Having an automated assistant to ensure consistent annotations could be a way to avoid bias.…”
Section: Results
confidence: 99%
“…Annotations can range from a fixed set of categorical labels that are associated 1-to-1 with data items, to sequential labels that may have order constraints, to complex, multifaceted structure [1], [2], [11], [12]. More recently, captioning tasks involve associating unstructured descriptions as annotations of data.…”
Section: Complex Annotation Tasks and Automation
confidence: 99%
“…Suresh and Guttag [190] define this bias as a positive value for a measure of divergence between the probability distribution over the input space and the true distribution, noting that it can occur simply as a result of random sampling from a distribution where some groups are in the minority. Others point to the potential for overlooked errors in the labeling process, which is often left undescribed in research papers [73], to lead to overfitting even in the absence of other types of noise [35,152], and the way that data preparation can be lossy whenever majority-rule is used to construct ground truth without preserving information about label distributions [54,93].…”
Section: Data Collection and Preparation
confidence: 99%
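The point in the statement above, that majority-rule ground truth is lossy whenever label distributions are not preserved, can be made concrete: two items can receive the same majority label while carrying very different levels of annotator agreement. A small illustrative sketch with invented votes:

```python
from collections import Counter

def majority_label(votes):
    """Hard label: the single most common annotation."""
    return Counter(votes).most_common(1)[0][0]

def label_distribution(votes):
    """Soft label: relative frequency of each label across annotators."""
    counts = Counter(votes)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}

contested = ["toxic", "toxic", "toxic", "ok", "ok"]  # 3-2 split
unanimous = ["toxic"] * 5                            # 5-0 agreement

# Majority rule maps both items to the same ground truth...
assert majority_label(contested) == majority_label(unanimous) == "toxic"
# ...but the full distributions expose the disagreement that aggregation hides.
print(label_distribution(contested))  # {'toxic': 0.6, 'ok': 0.4}
print(label_distribution(unanimous))  # {'toxic': 1.0}
```

Keeping the distribution (or per-annotator labels) alongside the hard label is what the cited work means by not discarding information during data preparation.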
“…
• High and unmodeled measurement error [14,59,134]
• Data transformations decided contingent on (NHST) results [83,181]
• Non-representative [105,143] or underdefined subject samples [88]; insufficient stimuli sampling [87,207,212]
• Small samples and noisy measurements (low power) leading to biased estimates [40]
• Differential measurement error [39,156,190,216]; unmodeled measurement error [119,127]
• Label errors [35,152] and disagreement [54,93]
• Data transformations decided contingent on performance comparisons [130]
• Underrepresentation of portions of input space in training data [13,157,190]
• Input data too huge to understand [19,157]
Model representation…”
Section: Data Selection and Preparation
confidence: 99%