Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d17-1116

ConStance: Modeling Annotation Contexts to Improve Stance Classification

Abstract: Manual annotations are a prerequisite for many applications of machine learning. However, weaknesses in the annotation process itself are easy to overlook. In particular, scholars often choose what information to give to annotators without examining these decisions empirically. For subjective tasks such as sentiment analysis, sarcasm, and stance detection, such choices can impact results. Here, for the task of political stance detection on Twitter, we show that providing too little context can result in noisy …

Cited by 26 publications (37 citation statements)
References 22 publications
“…Hypothesis: Individual annotators differ in annotation similarity in the contextual presentations, compared to the randomized presentation. Joseph et al in [25] show that while insufficient context results in noisy and uncertain annotations, an overabundance of context may cause the context to outweigh other signals and lead to lower agreement. Further, contextual information biases different people differently on both temporal and intensity metrics [26,27].…”
Section: Question
mentioning confidence: 99%
“…– Manual annotation may yield subjective and noisy labels. Many factors affect the quality of human annotations, including: (i) unreliable annotators, (ii) poorly specified annotation tasks and guidelines, (iii) poor category design (categories that are too broad, too narrow, or too vague), or (iv) insufficient information to make a reliable assessment (Cheng and Cosley, 2013; Joseph et al., 2017). Though the goal of an assessment task is to provide human input, underspecification or appeal to subjective judgment can introduce unintended biases that are often hard to detect.…”
Section: Issues Introduced While Processing Data
mentioning confidence: 99%
“…Recent work has shown that considering language within the context of user attributes can improve classification accuracy (Volkova et al., 2013; Bamman et al., 2014; Yang and Eisenstein, 2015; Hovy, 2015; Kulkarni et al., 2016; Lynn et al., 2017). Other work has used network or other metadata, such as in Bamman and Smith (2015); Johnson and Goldwasser (2016); Joseph et al. (2017); Khattri et al. (2015). In a sense, these trail-blazing works might be viewed as case studies on user attributes: identifying particular pieces of information for particular tasks where user information has led to an advantage.…”
Section: Introduction
mentioning confidence: 99%