Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d17-1204

Finding Patterns in Noisy Crowds: Regression-based Annotation Aggregation for Crowdsourced Data

Abstract: Crowdsourcing offers a convenient means of obtaining labeled data quickly and inexpensively. However, crowdsourced labels are often noisier than expert-annotated data, making it difficult to aggregate them meaningfully. We present an aggregation approach that learns a regression model from crowdsourced annotations to predict aggregated labels for instances that have no expert adjudications. The predicted labels achieve a correlation of 0.594 with expert labels on our data, outperforming the best alternative ag…
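The abstract describes learning a regression model over crowdsourced annotations so that instances without expert adjudications can still receive aggregated labels. The sketch below is only an illustration of that kind of pipeline; the feature set (per-instance mean, spread, and annotation count) and the choice of ridge regression are assumptions for the example, not the paper's exact configuration.

```python
# Illustrative sketch of regression-based annotation aggregation.
# Assumptions (not from the paper): features are simple statistics of each
# instance's crowd annotations, and the regressor is ridge regression.
import numpy as np
from sklearn.linear_model import Ridge

def instance_features(annotations):
    """Summarize one instance's crowd annotations as a feature vector."""
    a = np.asarray(annotations, dtype=float)
    return np.array([a.mean(), a.std(), a.min(), a.max(), len(a)])

# Instances with expert adjudications are used to fit the regressor ...
adjudicated = {
    "inst1": ([1, 2, 1, 1, 3], 1.5),   # (crowd annotations, expert label)
    "inst2": ([3, 3, 2, 3, 3], 2.8),
    "inst3": ([0, 1, 0, 0, 1], 0.3),
}
# ... and the remaining instances only have crowd annotations.
unadjudicated = {
    "inst4": [2, 2, 3, 1, 2],
    "inst5": [0, 0, 1, 0, 0],
}

X = np.stack([instance_features(ann) for ann, _ in adjudicated.values()])
y = np.array([gold for _, gold in adjudicated.values()])
model = Ridge(alpha=1.0).fit(X, y)

# Predicted aggregated labels for instances with no expert adjudication.
X_new = np.stack([instance_features(ann) for ann in unadjudicated.values()])
for name, pred in zip(unadjudicated, model.predict(X_new)):
    print(name, round(float(pred), 2))
```

In a setup like this the trained regressor replaces simple averaging: instances whose raw annotations disagree can still receive a continuous aggregated score shaped by patterns learned from the adjudicated subset.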

Cited by 11 publications (10 citation statements). References 7 publications.

“…A range of options have been developed for improving crowdsourcing quality. The most common approach is to collect multiple annotations and then aggregate them (Hovy et al., 2013; Passonneau and Carpenter, 2014; Parde and Nielsen, 2017; Dumitrache et al., 2018). This can identify inconsistencies, but at significant cost as each example must be annotated multiple times.…”
Section: Crowdsourcing Quality (mentioning)
confidence: 99%
“…Training on data with these issues will lead to lower quality models, which in turn decrease the effectiveness of the overall dialog system. Most research on improving data quality has focused on mechanisms such as aggregation (Parde and Nielsen, 2017), worker filtering (Li and Liu, 2015), and attention checks (Oppenheimer et al., 2009). These all raise costs and primarily address clear inconsistencies (such as in examples 1, 4, 5, and 6) but not more subtle cases like the inclusion of "dollar" in examples 2 and 3.…”
Section: Introduction (mentioning)
confidence: 99%
“…We crowdsource advice annotations from Amazon Mechanical Turk. Despite the inherent noise due to crowdsourcing (Parde and Nielsen, 2017), recent work showed that when designed carefully, aggregated crowdsourced annotations are trustworthy even for complex tasks (Nye et al., 2018).…”
Section: Annotation Task (mentioning)
confidence: 99%
“…We collected gold standard metaphor novelty scores for these word pairs in the same manner by which we built our previous VUAMC-based metaphor novelty dataset (Parde and Nielsen, 2018a), used to train the metaphor novelty prediction model in this work. Specifically, we crowdsourced five annotations for each word pair, and automatically aggregated them to continuous scores using a label aggregation model learned from features based on annotation distribution and presumed worker trustworthiness (Parde and Nielsen, 2017). There were two statistically significant differences between the two groups: questions about false positives were rated as clearer than questions about true positives, and questions about true positives were rated as having more depth than questions about false positives.…”
Section: Average Ratings For Question Subgroups (mentioning)
confidence: 99%
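The excerpt above mentions label aggregation features based on the annotation distribution and presumed worker trustworthiness. One common way to approximate a trustworthiness feature, shown here only as a hypothetical sketch (the function name and scoring formula are assumptions, not the authors' formulation), is to score each worker by how closely their labels track the leave-one-out mean of the other workers' labels on shared items.

```python
# Hypothetical sketch of a worker-trustworthiness feature: agreement of each
# worker with the leave-one-out mean of the other workers on shared items.
from collections import defaultdict

def worker_trust(labels):
    """labels: list of (worker_id, item_id, score) crowd annotations."""
    by_item = defaultdict(list)
    for worker, item, score in labels:
        by_item[item].append((worker, score))

    errors = defaultdict(list)
    for item, entries in by_item.items():
        for worker, score in entries:
            others = [s for w, s in entries if w != worker]
            if others:  # need at least one other annotation to compare against
                errors[worker].append(abs(score - sum(others) / len(others)))

    # Lower average disagreement -> higher presumed trustworthiness.
    return {w: 1.0 / (1.0 + sum(e) / len(e)) for w, e in errors.items()}

labels = [
    ("w1", "pair1", 2), ("w2", "pair1", 2), ("w3", "pair1", 0),
    ("w1", "pair2", 3), ("w2", "pair2", 3), ("w3", "pair2", 1),
]
print(worker_trust(labels))  # w1 and w2 score higher than the outlying w3
```

Scores of this kind can then enter a regression-based aggregator alongside distribution statistics, for example as a trust-weighted mean of each item's annotations.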