Proceedings of the SIGCHI Conference on Human Factors in Computing Systems 2010
DOI: 10.1145/1753326.1753688
Are your participants gaming the system?

Abstract: In this paper we discuss a screening process used in conjunction with a survey administered via Amazon.com's Mechanical Turk. We sought an easily implementable method to disqualify those people who participate but don't take the study tasks seriously. By using two previously pilot tested screening questions, we identified 764 of 1,962 people who did not answer conscientiously. Young men seem to be most likely to fail the qualification task. Those that are professionals, students, and non-workers seem to be mor…
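
The abstract describes disqualifying respondents who fail two pilot-tested screening questions. As a rough illustration of that idea only, the sketch below separates respondents by whether they answered both screening items correctly; the question ids, answer key, and function names are hypothetical stand-ins, not the items used in the original study.

```python
# Hypothetical two-question qualification screener, loosely following the
# screening idea in the abstract. Question ids and answers are stand-ins.

SCREENER_KEY = {
    "q_screen_1": "blue",   # assumed correct answer for screening item 1
    "q_screen_2": "seven",  # assumed correct answer for screening item 2
}

def passes_screener(response: dict) -> bool:
    """True only if the respondent answered both screening items correctly."""
    return all(
        str(response.get(qid, "")).strip().lower() == expected
        for qid, expected in SCREENER_KEY.items()
    )

def split_by_qualification(responses):
    """Separate conscientious (qualified) respondents from screened-out ones."""
    qualified = [r for r in responses if passes_screener(r)]
    screened_out = [r for r in responses if not passes_screener(r)]
    return qualified, screened_out
```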

Citation Types: 9 supporting, 139 mentioning, 0 contrasting
Year Published: 2016-2023
Cited by 311 publications (148 citation statements)
References 3 publications

“…Experiments 3, 4, and 5 had similar failure rates (38%, 45%, and 40%, respectively). These rates are above or at the high end of those reported in research investigating Mechanical Turk as a subject pool (10%-39%; Downs, Holbrook, Sheng, & Cranor, 2010; Goodman, Cryder, & Cheema, 2013; Kapelner & Chandler, 2010). We suspect that our high failure rates are an artifact of the attention check we used.…”
Section: Results
confidence: 60%
“…We suspect that our high failure rates are an artifact of the attention check we used. The article that subjects read was six paragraphs long and came at the end of the experiment, when subjects would be most fatigued and tempted to skim or skip material (see Downs et al., 2010). Moreover, the effort involved in the reading task was different from that of judging the truth of claims.…”
Section: Results
confidence: 99%
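
The failure rates quoted above (38%, 45%, 40%) are per-experiment proportions of subjects who failed the attention check. A minimal sketch of that bookkeeping, assuming a hypothetical list of subject records with "experiment" and "passed_attention_check" fields:

```python
from collections import defaultdict

def failure_rates(subjects):
    """Map experiment id -> fraction of subjects who failed the attention check."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for s in subjects:
        totals[s["experiment"]] += 1
        if not s["passed_attention_check"]:
            failures[s["experiment"]] += 1
    return {exp: failures[exp] / totals[exp] for exp in totals}
```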
“…Another potential benefit of improvements to task routing and assignment is that it might mitigate the need for error-controlling approaches that are currently predominantly used in crowdsourcing, such as the Gold Standard [10]. This can be quite important in certain situations, as authoring gold data can be burdensome, and gold standards are challenging to implement in subjective or generative tasks, like writing an essay [31].…”
Section: Benefits Of Appropriate Task Routing and Assignment In Crowdsourcing
confidence: 99%
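
As a rough illustration of the Gold Standard approach referenced above ([10]), the sketch below seeds known-answer items into the task stream and screens workers by their accuracy on those items. The task ids, answers, threshold, and function names are assumptions for illustration, not drawn from the cited papers.

```python
# Hypothetical gold items with known answers, mixed among regular tasks.
GOLD_ANSWERS = {"task_017": "cat", "task_042": "dog"}
ACCURACY_THRESHOLD = 0.8  # assumed cutoff, not from the cited papers

def gold_accuracy(worker_answers):
    """Fraction of gold items this worker answered correctly (1.0 if none seen)."""
    seen = [t for t in worker_answers if t in GOLD_ANSWERS]
    if not seen:
        return 1.0
    correct = sum(worker_answers[t] == GOLD_ANSWERS[t] for t in seen)
    return correct / len(seen)

def is_trustworthy(worker_answers):
    """Keep only workers whose accuracy on gold items clears the threshold."""
    return gold_accuracy(worker_answers) >= ACCURACY_THRESHOLD
```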
“…Currently, quality control is typically implemented as posthoc filtering of substandard answers and "smart" aggregation of the crowd contributions. A common technique is to adopt a Gold Standard [10], which entails the creation and inclusion of tasks that have known answers to the requested crowdsourcing job. Another approach is to analyse the extent to which workers agree with each other in their answers [4,17,28].…”
Section: Introduction
confidence: 99%
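
A minimal sketch of the two quality-control styles mentioned above: post-hoc aggregation of crowd answers by majority vote, and a simple per-worker agreement score. The data layout and function names are assumptions; the agreement analyses in [4,17,28] may be considerably more sophisticated than this.

```python
from collections import Counter

def majority_vote(labels_per_item):
    """Aggregate crowd labels item by item, keeping the most common answer."""
    return {item: Counter(labels).most_common(1)[0][0]
            for item, labels in labels_per_item.items()}

def agreement_with_majority(worker_labels, consensus):
    """Fraction of a worker's items on which they match the majority answer."""
    shared = [item for item in worker_labels if item in consensus]
    if not shared:
        return 0.0
    return sum(worker_labels[i] == consensus[i] for i in shared) / len(shared)
```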
“…In some implementations, training results in a pass/fail that screens out untrustworthy or inaccurate users (Downs et al. 2010, Le et al. 2010). In others, the score attributed to each user's vote is weighted based upon how well they perform during the training (Sheng et al. 2014).…”
Section: Improving the Voting System
confidence: 99%
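
As a rough sketch of the two variants described above, the code below shows a pass/fail cutoff on training accuracy and a vote aggregation in which each user's label is weighted by their training score. The weighting rule, cutoff, and names are illustrative assumptions, not the schemes from the cited work.

```python
from collections import defaultdict

def passes_training(training_accuracy, user, cutoff=0.7):
    """Pass/fail screening: only trust users above an assumed training-accuracy cutoff."""
    return training_accuracy.get(user, 0.0) >= cutoff

def weighted_vote(votes, training_accuracy):
    """votes: (user_id, label) pairs; each label counts in proportion to the voter's training score."""
    weight_per_label = defaultdict(float)
    for user, label in votes:
        weight_per_label[label] += training_accuracy.get(user, 0.0)
    if not weight_per_label:
        return None
    return max(weight_per_label, key=weight_per_label.get)
```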