Crowdsourcing citation-screening in a mixed-studies systematic review: a feasibility study

2021 | DOI: 10.1186/s12874-021-01271-4

Abstract: Background: Crowdsourcing engages the help of large numbers of people in tasks, activities or projects, usually via the internet. One application of crowdsourcing is the screening of citations for inclusion in a systematic review. There is evidence that a ‘Crowd’ of non-specialists can reliably identify quantitative studies, such as randomized controlled trials, through the assessment of study titles and abstracts. In this feasibility study, we investigated crowd performance of an online, topic-…

Cited by 11 publications (23 citation statements)
References 29 publications (35 reference statements)
“…For both ML algorithms, we will determine the threshold at which the sensitivity is >95% when used in combination with a single human reviewer. This approach is consistent with the individual sensitivity of expert reviewers, as described in previous studies [20, 23, 32, 33].…”
Section: Methods (supporting; confidence: 88%)
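As a rough illustration of the threshold-selection step described in the excerpt above, the sketch below picks the highest classifier score cut-off whose sensitivity on a labelled validation set exceeds 95%. It is a minimal, hypothetical example: the `pick_threshold` function and the toy data are assumptions, not code from the cited protocol, and it ignores the paired human reviewer for simplicity.

```python
# Hypothetical sketch (not from the cited protocol): choose the highest ML
# score threshold whose sensitivity on a labelled validation set exceeds 95%.
# For simplicity this ignores the paired human reviewer mentioned above.
import numpy as np

def pick_threshold(scores, labels, target_sensitivity=0.95):
    """Return the highest threshold with sensitivity above the target.

    scores -- predicted inclusion probabilities for validation citations
    labels -- 1 if the citation was truly included, 0 otherwise
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    n_included = labels.sum()
    # Lowering the threshold can only raise sensitivity, so scan candidate
    # thresholds from high to low and stop at the first one that qualifies.
    for t in np.unique(scores)[::-1]:
        sensitivity = ((scores >= t) & (labels == 1)).sum() / n_included
        if sensitivity > target_sensitivity:
            return t
    return None  # no threshold reaches the target

# Toy usage with made-up validation data
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=500)
scores = np.clip(0.55 * labels + rng.normal(0.25, 0.2, size=500), 0.0, 1.0)
print(pick_threshold(scores, labels))
```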
“…Despite crowd sensitivity not achieving 100% for three of the four reviews used in this evaluation study, sensitivity was comparable to other similar studies run by this and other research teams 9, 10, 11, 12 and potentially more accurate than having the search results screened by a single human assessor. 24 However, it is arguable that providing a measure of sensitivity where the prevalence of included studies within each of the review datasets was very low, should be considered with caution: Review 1 had a prevalence of 0.87%, Review 2: 1.07%, Review 3: 0.53%, Review 4: 2%.…”
Section: Discussion (supporting; confidence: 70%)
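To make the low-prevalence caveat concrete, the arithmetic below uses a hypothetical dataset size (the excerpt reports prevalences but not record counts): at 0.87% prevalence, only a few dozen included studies drive the sensitivity estimate, so a single missed study shifts it by roughly a percentage point.

```python
# Illustrative arithmetic only; the record count is a hypothetical assumption,
# not a figure reported in the excerpt above.
n_records = 10_000                 # assumed number of screened citations
prevalence = 0.0087                # Review 1 prevalence (0.87%) from the excerpt
n_included = round(n_records * prevalence)   # about 87 truly included studies
shift_per_miss = 100 / n_included            # sensitivity change per missed study, in points
print(f"{n_included} includes; one miss moves sensitivity by ~{shift_per_miss:.1f} pp")
```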
“…In terms of the agreement algorithm, we chose an algorithm (three consecutive agreements) that had produced high collective accuracy in other similar pilot projects. 9, 10 Would altering the consecutive number of agreeing classifications have made a difference to collective accuracy? Starting with the accuracy of a single classification, the mean accuracy of individual contributors for each review was: 84.2% sensitivity and 82.2% specificity for Review 1; 86.6% sensitivity, 84.1% specificity for Review 2; 85.1% sensitivity, 89.9% specificity for Review 3; and 89.3% sensitivity, 90.9% specificity for Review 4.…”
Section: Discussion (mentioning; confidence: 99%)
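The excerpt's question about varying the number of consecutive agreements can be explored with a simple simulation. The sketch below is a hypothetical model, not the cited teams' algorithm: it assumes independent contributors with identical per-classification accuracy and estimates how often a "k consecutive agreements" rule lands on the correct label.

```python
# Hypothetical Monte Carlo sketch of a "k consecutive agreements" rule.
# Assumes independent contributors with identical accuracy, which is a
# simplification of real crowd behaviour in the cited projects.
import random

def collective_accuracy(individual_accuracy, k, trials=100_000, max_votes=50):
    """Estimate how often the agreed label matches the true classification
    when a record is resolved by k consecutive identical classifications."""
    correct = 0
    for _ in range(trials):
        streak_label, streak_len = None, 0
        for _ in range(max_votes):
            label = random.random() < individual_accuracy  # True = correct vote
            if label == streak_label:
                streak_len += 1
            else:
                streak_label, streak_len = label, 1
            if streak_len == k:
                correct += int(label)  # resolved; count it if it is correct
                break
        # unresolved records (vanishingly rare here) count as incorrect
    return correct / trials

# An individual accuracy of ~85% is in the range reported in the excerpt above
for k in (2, 3, 4):
    print(k, round(collective_accuracy(0.85, k), 3))
```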
“…We successfully incorporated crowdsourcing and ML to execute a large scoping review in an efficient timeline without hindering sensitivity. We used targeted recruitment techniques to assemble an international and multidisciplinary team with a performance (sensitivity mean [sd], 0.92 [0.9]) consistent with or better than other studies evaluating single and crowdsourced reviewers (22, 23, 95, 96). We used a rigorous data-driven approach to conservatively integrate a ML algorithm into citation screening, which missed none of the included studies and significantly improved review efficiency.…”
Section: Discussion (mentioning; confidence: 99%)