2018
DOI: 10.1101/265918
Preprint

Crowdsourcing Image Analysis for Plant Phenomics to Generate Ground Truth Data for Machine Learning

Abstract: The accuracy of machine learning tasks critically depends on high-quality ground truth data. Producing good ground truth data therefore typically involves trained professionals, which can be costly in time, effort, and money. Here we explore the use of crowdsourcing to generate a large number of good-quality training data points. We consider an image analysis task involving the segmentation of corn tassels from images taken in a field setting, and evaluate the accuracy, speed and o…
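The accuracy question raised in the abstract, i.e. how closely crowdsourced segmentations match expert ones, is typically scored with an overlap metric. Below is a minimal sketch in Python, assuming masks are boolean NumPy arrays; the function and example data are illustrative assumptions, not the paper's actual pipeline.

```python
import numpy as np

def iou(expert_mask: np.ndarray, worker_mask: np.ndarray) -> float:
    """Intersection-over-union between two boolean segmentation masks.

    A common way to score a crowdsourced tassel mask against an
    expert "ground truth" mask: 1.0 is perfect agreement, 0.0 none.
    (Illustrative sketch; the paper's actual metric may differ.)
    """
    intersection = np.logical_and(expert_mask, worker_mask).sum()
    union = np.logical_or(expert_mask, worker_mask).sum()
    # Two empty masks agree trivially; avoid division by zero.
    return float(intersection / union) if union else 1.0

# Hypothetical example: a worker's region overlaps the expert's region.
expert = np.zeros((100, 100), dtype=bool)
expert[20:60, 30:50] = True
worker = np.zeros((100, 100), dtype=bool)
worker[25:65, 28:50] = True
print(f"IoU = {iou(expert, worker):.2f}")
```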

Cited by 10 publications (14 citation statements)
References 27 publications

“…Now, the millions of images of herbarium specimens available make a wider diversity of phenotypes freely accessible, an accomplishment soon to be followed by zoological collections. In turn, easy access to specimen images supports data collection with crowdsourcing, a relatively fast and cheap way to obtain high‐quality data (Chang & Alfaro, 2016; O'Leary et al., 2018; Zhou et al., 2018). Moreover, crowdsourced datasets may also aid the improvement of automatic extraction methods (Burleigh et al., 2013; Zhou et al., 2018), or be used in parallel with them to improve data collection.…”
Section: Discussion (mentioning)
confidence: 99%
“…In turn, easy access to specimen images supports data collection with crowdsourcing, a relatively fast and cheap way to obtain high‐quality data (Chang & Alfaro, 2016; O'Leary et al., 2018; Zhou et al., 2018). Moreover, crowdsourced datasets may also aid the improvement of automatic extraction methods (Burleigh et al., 2013; Zhou et al., 2018), or be used in parallel with them to improve data collection. For example, while machines still cannot differentiate overlapped plant organs (Gehan & Kellogg, 2017), an automatic approach can be complemented by human input, as we can easily identify and overcome overlapping issues.…”
Section: Discussion (mentioning)
confidence: 99%
“…For example, CrowdCurio-Thoreau's Field Notes, an online crowdsourcing platform, has successfully facilitated climate change studies from thousands of herbarium specimens utilizing thousands of non-expert crowdsourcers (Willis et al. 2017). Quality control is always a concern in large-scale citizen science projects (Willis et al. 2017, Zhou et al. 2018) and thus an easy-to-use graphical user interface clearly demonstrating to the public how and what to digitize will be necessary (e.g., Notes for Nature), as has been accomplished in several research-based projects (Chang and Alfaro 2016, Cooney et al. 2017, Willis et al. 2017). Increasingly, such citizen science efforts are being supplemented by machine-based learning as well (Unger et al. 2016, Wilf et al. 2016, Schuettpelz et al. 2017).…”
Section: Expansion of the Digitization Workforce - Expanding Digitizati… (mentioning)
confidence: 99%
“…Recent work has indicated that carefully crowdsourced annotations can expedite data processing tasks that have a visual component. Volunteer-based citizen science has made substantial contributions to areas of biology from proteomics (24-26) to ecology (27). When tasks are less intrinsically interesting to volunteers, minimally-trained workers can complete tasks for small payments through crowdsourcing platforms such as Amazon's Mechanical Turk (MTurk), and the consensus annotations (across multiple workers or "turkers") can be highly comparable with expert annotations and sufficiently reliable for use as training data for detection algorithms.…”
Section: Introduction (mentioning)
confidence: 99%
“…When tasks are less intrinsically interesting to volunteers, minimally-trained workers can complete tasks for small payments through crowdsourcing platforms such as Amazon's Mechanical Turk (MTurk), and the consensus annotations (across multiple workers or "turkers") can be highly comparable with expert annotations and sufficiently reliable for use as training data for detection algorithms (27-29). Therefore, we hypothesized that consensus from crowdsourced annotations can be used as a substitute for ground truth to tune and benchmark spot-calling algorithms. However, there are no published in situ transcriptomics pipelines that can incorporate ground truth from crowdsourced annotations.…”
Section: Introduction (mentioning)
confidence: 99%
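The consensus idea running through the last two statements, i.e. aggregating several workers' annotations into one surrogate ground truth, reduces in its simplest form to a per-pixel majority vote. The sketch below is a minimal illustration in Python with NumPy, not the cited pipeline; the mask shapes, names, and vote threshold are hypothetical assumptions.

```python
import numpy as np

def consensus_mask(worker_masks: list[np.ndarray], min_votes: int = 2) -> np.ndarray:
    """Per-pixel majority vote over boolean annotation masks.

    A pixel enters the consensus "ground truth" only if at least
    `min_votes` workers marked it. All masks must share one shape.
    (Hypothetical helper; the cited works may aggregate differently.)
    """
    votes = np.stack(worker_masks).sum(axis=0)  # per-pixel vote counts
    return votes >= min_votes

# Hypothetical example: three turkers annotate the same 4x4 image.
w1 = np.array([[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]], dtype=bool)
w2 = np.array([[1, 1, 1, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]], dtype=bool)
w3 = np.array([[0, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 1, 0]], dtype=bool)
# Pixels marked by at least 2 of the 3 workers survive the vote.
print(consensus_mask([w1, w2, w3]).astype(int))
```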