2019
DOI: 10.25300/misq/2019/14439

Expecting the Unexpected: Effects of Data Collection Design Choices on the Quality of Crowdsourced User-Generated Content

Abstract: This appendix describes our applicability check in more detail. The purpose of the applicability check (Rosemann and Vessey 2008) was to determine whether attribute data could be transformed to a form (in this case, species-level classification) useful to data consumers (in this case, biologists). We also used the applicability check to explore perceptions that biologists in a university setting held about the potential uses and usefulness of data collected using an instance-based approach (versus a class-base…

Cited by 63 publications (39 citation statements)
References 27 publications

“…For example, participants who know the objective of the project may overinflate or exaggerate information (Galloway et al. 2006; Miller et al. 2012). Some have explored the use of tasks that specifically do not require training (Eveleigh et al. 2014; Lukyanenko et al. 2019), while others have investigated the possibility of attracting diverse crowds so that training-induced biases are mitigated by the diversity of the participants (Ogunseye et al. 2017; Ogunseye and Parsons 2016). All these are novel ideas for traditional information quality research.…”
Section: Training and Evaluating Learning and Performance
confidence: 99%
“…Thus, holding data contributors to data consumer standards may curtail their ability to provide high-quality content as defined by data consumers, or suggest a need to refine instructions, procedures, or expectations. Guided by this definition, a series of laboratory and field experiments demonstrated that the accuracy and completeness of citizen science data can indeed be improved by relaxing the requirement to comply with data consumer needs (Lukyanenko et al. 2014a, 2014b; Lukyanenko et al. 2019), which is a standard project design strategy for achieving data quality targets in citizen science. Further, for projects focusing on new, emerging phenomena, it may be challenging to anticipate the optimal structure of citizen science data due to the different needs of diverse data consumers, so traditional solutions for storage premised on a priori structures (e.g., relational databases) may be inadequate in this setting (Sheppard et al. 2014).…”
Section: Re-examining IS Research Assumptions
confidence: 99%
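The storage point in the excerpt above can be made concrete with a minimal, purely illustrative sketch: when contributors report whatever attributes they happen to notice, records do not share a fixed column set, so a schema-free document format absorbs them as-is, whereas a relational table designed around an a priori class structure cannot. All observers, attributes, and values below are invented for illustration and are not taken from the cited studies.

    import json

    # Hypothetical instance-based contributions: each observer reports whatever
    # attributes they noticed, so records need not share a fixed set of fields.
    observations = [
        {"observer": "c01", "attributes": ["small", "black wings", "red chest"]},
        {"observer": "c02", "attributes": ["swimming", "webbed feet"], "location": "pond"},
        {"observer": "c03", "species_guess": "American Robin"},  # expert-style report
    ]

    # A schema-free store (here: newline-delimited JSON) accepts all of them as-is,
    # whereas a relational table would force every record into predefined columns.
    with open("observations.jsonl", "w", encoding="utf-8") as f:
        for obs in observations:
            f.write(json.dumps(obs) + "\n")

    # Data consumers can later project these loose records into whatever structure
    # their analysis needs, rather than constraining contributors up front.
    with open("observations.jsonl", encoding="utf-8") as f:
        loaded = [json.loads(line) for line in f]
    print(loaded[1].get("location"))  # -> "pond"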
“…Lukyanenko et al. (2019a, p. 7) state that despite its importance for society and its relatedness to our discipline, IS “[…] continues to lag behind such disciplines as biology and education in working with citizen science as a context for research.” Disciplines like biology, conservation, and physics are much more active here (demonstrated clearly by Lukyanenko et al. 2019b). Although there are some academic articles on citizen science in the IS literature, there is still a lack of citizen science projects with clear IS research questions.…”
Section: Citizen Science
confidence: 99%
“…Such a technique might be used as an automatic reconciliation system that treats every new contribution of sets of attributes as raw data and, simultaneously, as training data for an instance. A recent study, for example, demonstrates the potential of machine learning classification by classifying fine-grained crowdsourced data into more useful coarse-grained data with reasonable accuracy [47]. Further exploration of how to use similar artificial intelligence tools to enhance the utility of crowdsourced data is a promising area for future research.…”
Section: Future Directions
confidence: 99%
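As a rough illustration of the kind of reconciliation the excerpt above describes, and not the specific model used in [47], the sketch below trains a simple text classifier that maps free-form, attribute-level contributions to species-level labels. The toy descriptions, species labels, and the choice of a TF-IDF plus logistic regression pipeline are all assumptions made only for illustration.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Toy training data: fine-grained attribute descriptions paired with the
    # species-level class a biologist would assign. All examples are invented.
    descriptions = [
        "small bird red chest black wings hopping on lawn",
        "red breast grey back pulling worms from grass",
        "large black bird loud caw glossy feathers",
        "black bird cawing in parking lot",
        "tiny bird hovering at feeder iridescent green throat",
        "very small hovering bird long thin beak",
    ]
    species = [
        "American Robin", "American Robin",
        "American Crow", "American Crow",
        "Ruby-throated Hummingbird", "Ruby-throated Hummingbird",
    ]

    # Bag-of-words features plus a linear classifier: a deliberately simple
    # stand-in for whatever model the cited study actually used.
    model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    model.fit(descriptions, species)

    # Classify a new fine-grained contribution into a coarser species-level label.
    new_report = ["small bird with a red chest seen on the lawn"]
    print(model.predict(new_report))  # likely ['American Robin'] given the word overlap

In practice the value of such a reconciliation step is that contributors keep reporting loose attributes while data consumers still receive the class-level records they need, with the classifier retrained as new labeled contributions arrive.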