2012
DOI: 10.1007/s10791-011-9181-9
Increasing cheat robustness of crowdsourcing tasks

Abstract: Crowdsourcing successfully strives to become a widely used means of collecting large-scale scientific corpora. Many research fields, including Information Retrieval, rely on this novel way of data acquisition. However, it seems to be undermined by a significant share of workers that are primarily interested in producing quick generic answers rather than correct ones in order to optimise their time-efficiency and, in turn, earn more money. Recently, we have seen numerous sophisticated schemes of identifying suc…

Cited by 135 publications (117 citation statements)
References 18 publications (18 reference statements)
“…However, crowd-sourcing is not a perfect system; issues such as quality control and the reliability of results need further investigation. Automatic quality-control methods, such as the gold units used in this study, can be a good option since data cleaning is a time-consuming and tedious task [14]. However, they do not always ensure high-quality data; since answers can only be rejected at runtime, they may attract more spammers, malicious, and sloppy workers [12,14].…”
Section: Validation of Crowd-sourcing Analysis (mentioning)
confidence: 99%
“…The works [68,69] attempt to identify the causes of low-quality results in crowd computing. Among the main factors affecting quality are competence, interest in completing the tasks (motivation), clarity of task presentation, and the presence or absence of "malicious intent" (deliberate undermining of the system).…”
Section: Game-theoretic Methods (unclassified)
“…A commonly used solution is to employ a worker reputation system, assigning tasks only to workers with approval ratings above a certain pre-set level [33]. Another set of methods for identifying and expelling unethical workers is based on a set of indices measuring (1) agreement with the expert "gold standard" data; (2) agreement with the other workers; (3) agreement with the attention-check questions; and (4) the amount of effort estimated from the task completion time [34]. The "gold standard" is a subset of the data that is processed by experts in the field; an important condition is that a lay person should be able to process this data easily and unambiguously.…”
mentioning
confidence: 99%
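The four indices in the citation above can be sketched as a simple worker filter. This is a minimal illustrative sketch only: the function name, data shapes, and thresholds are assumptions, not the implementation described in [33] or [34].

```python
# Hypothetical sketch of a worker-quality filter combining the four indices
# discussed above: (1) gold-standard agreement, (2) peer agreement,
# (3) attention checks, (4) completion time. Thresholds are illustrative.
from statistics import mean

def flag_worker(answers, gold, peer_majority, attention_passed, task_times,
                gold_thresh=0.7, peer_thresh=0.6, min_seconds=5.0):
    """Return True if the worker's answers should be discarded.

    answers:        dict mapping item id -> this worker's label
    gold:           dict mapping item id -> expert label (gold-standard subset)
    peer_majority:  dict mapping item id -> majority label of other workers
    attention_passed: True if all attention-check questions were correct
    task_times:     list of per-task completion times in seconds
    """
    # (1) agreement with the expert "gold standard" items
    gold_acc = mean(answers[i] == gold[i] for i in gold)
    # (2) agreement with the majority answer of the other workers
    peer_acc = mean(answers[i] == peer_majority[i] for i in peer_majority)
    # (4) a very low average completion time suggests click-through behaviour
    too_fast = mean(task_times) < min_seconds
    # (3) attention checks are a hard requirement
    return (gold_acc < gold_thresh or peer_acc < peer_thresh
            or not attention_passed or too_fast)
```

In practice the thresholds would be calibrated on pilot data; the sketch only shows how the four signals combine into a single accept/reject decision.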
“…The attention-check and language-comprehension questions are verifiable questions [29] that do not require factual knowledge [36]; the results obtained from workers who fail to answer the attention questions correctly should be discarded. Finally, the average time to complete a single task is used to identify low-quality workers, who presumably spend less time per task [34].…”
mentioning
confidence: 99%