2021
DOI: 10.2196/25714

Using a Secure, Continually Updating, Web Source Processing Pipeline to Support the Real-Time Data Synthesis and Analysis of Scientific Literature: Development and Validation Study

Abstract: Background: The scale and quality of the global scientific response to the COVID-19 pandemic have unquestionably saved lives. However, the COVID-19 pandemic has also triggered an unprecedented “infodemic”; the velocity and volume of data production have overwhelmed many key stakeholders such as clinicians and policy makers, as they have been unable to process structured and unstructured data for evidence-based decision making. Solutions that aim to alleviate this data synthesis–related challenge are…

citations
Cited by 5 publications
(6 citation statements)
references
References 9 publications
“…Eleven papers used a plethora of text-mining tools to aid search syntax building, such as Anne O’Tate, AntConc, Apache Lucene, BiblioShiny, Carrot2, CitNetExplorer, EndNote, Keyword‐Analyzer, Leximancer, Lingo3G, Lingo4G, MeSH on Demand, MetaMap, Microsoft Academic, PubReMiner, Systematic Review Accelerator, TerMine, Text Analyzer, Tm for R, VOSviewer, Voyant, Yale MeSH Analyzer, and in-house solutions [18, 33–35, 37, 41, 46, 47, 49–51]. Two papers introduced curated article collections, such as Cochrane CENTRAL [44], and the Realtime Data Synthesis and Analysis (REDASA) COVID-19 dataset [48], which were assembled using various automation techniques. Other tools included an automated extension of PubMed searches to the ClinicalTrials.gov database [40], a Boolean query refiner [42], a support vector machine (SVM) classifier as alternative to PubMed search filters for review updating [38], a strategy using the Patient, Intervention, Comparator, and Outcome framework (PICO) terms in the title field only [39], an automated full-text retrieval and targeted search replacing database screening [45], and a Microsoft Excel-based convenience tool to build Boolean queries [43].…”
Section: Results; citation type: mentioning
confidence: 99%
“…Further approaches included PECO tagging in a rapid evidence mapping study using SWIFT Review [125], extraction of geographic locations from the manuscript [141], extraction of endpoints as comparative claim sentences [142], data extraction from ClinicalTrials.gov for meta-analyses [143], and convenience tools to highlight relevant sentences [74], or extract data from graphs [144]. Finally, development of the REDASA COVID-19 dataset involved human experts in the loop, web-crawling, and a natural language processing search engine to provide a real-time curated open dataset for evidence syntheses to aid pandemic response [48].…”
Section: Results; citation type: mentioning
confidence: 99%
“…Nineteen included papers (15.4%) aimed to automate or improve database searches [18, 43, 51, 54, 56, 61, 70, 76, 78, 98, 101, 104, 111, 113, 115, 122, 127, 136, 140]. The first included paper from 2011 applied text-mining to construct a search syntax for PubMed, using the Apache Lucene platform [43].…”
Section: Search; citation type: mentioning
confidence: 99%
“…Further approaches included PECO tagging in a rapid evidence mapping study using SWIFT Review [91], extraction of geographic locations from the manuscript [133], extraction of endpoints as comparative claim sentences [55], data extraction from ClinicalTrials.gov for meta-analyses [97], and convenience tools to highlight relevant sentences [130], or extract data from graphs [88]. Finally, development of the REDASA Covid-19 dataset involved human experts in the loop, web-crawling and a natural language processing search engine to provide a real-time curated open dataset for evidence syntheses to aid pandemic response [127].…”
Section: Data Extraction; citation type: mentioning
confidence: 99%