2016
DOI: 10.1371/journal.pone.0162721

SparkText: Biomedical Text Mining on Big Data Framework

Abstract: Background: Many new biomedical research articles are published every day, accumulating rich information, such as genetic variants, genes, diseases, and treatments. Rapid yet accurate text mining on large-scale scientific literature can discover novel knowledge to better understand human diseases and to improve the quality of disease diagnosis, prevention, and treatment. Results: In this study, we designed and developed an efficient text mining framework called SparkText on a Big Data infrastructure, which is compo…

Cited by 34 publications (18 citation statements)
References 19 publications
“…Finally, in order to predict document's subject area based on its abstract, text mining-based classification is used [46]. For this purposes, binary logistic regression is selected as a prediction model.…”
Section: Methods (mentioning)
confidence: 99%
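
The citing statement above describes predicting a document's subject area from its abstract with binary logistic regression. As a rough illustration of that general technique only, here is a minimal sketch in plain Python. It is not the citing paper's code and not SparkText's implementation; the toy abstracts, the labels, the bag-of-words features, and every function name (`tokenize`, `build_vocab`, `vectorize`, `train`, `predict_proba`) are assumptions made up for this example.

```python
import math
from collections import Counter

def tokenize(text):
    # Lowercase and keep purely alphabetic whitespace-separated tokens.
    return [t for t in text.lower().split() if t.isalpha()]

def build_vocab(docs):
    # Map every token seen in the training documents to a column index.
    tokens = sorted({tok for doc in docs for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(tokens)}

def vectorize(text, vocab):
    # Bag-of-words term counts; tokens outside the vocabulary are ignored.
    vec = [0.0] * len(vocab)
    for tok, count in Counter(tokenize(text)).items():
        if tok in vocab:
            vec[vocab[tok]] = float(count)
    return vec

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train(X, y, lr=0.5, epochs=200):
    # Binary logistic regression fitted by stochastic gradient descent
    # on the log-loss; returns the weight vector and the bias.
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # gradient of the log-loss with respect to the logit
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict_proba(text, vocab, w, b):
    # Probability that the text belongs to the positive class (label 1).
    x = vectorize(text, vocab)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical toy data: 1 = biomedical abstract, 0 = computing abstract.
abstracts = [
    "gene mutation linked to breast cancer tumor growth",
    "tumor suppressor gene expression in lung cancer",
    "distributed cluster scheduling for parallel computing",
    "parallel computing framework for cluster storage",
]
labels = [1, 1, 0, 0]

vocab = build_vocab(abstracts)
X = [vectorize(a, vocab) for a in abstracts]
w, b = train(X, labels)

for query in ("cancer gene tumor", "parallel cluster computing"):
    label = 1 if predict_proba(query, vocab, w, b) > 0.5 else 0
    print(query, "->", label)
```

A real pipeline would likely use TF-IDF weighting, regularization, and a held-out evaluation set; the sketch omits these to keep the core idea visible.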
“…For example, iHOP uses a text mining approach wherein genes and proteins are used as hyperlinks between sentences and PubMed abstracts and then uses the textmined information to produce network representations that users can browse [36]. Other tools include Twister that is aimed at reducing the screening time of systematic literature reviews [37]; SWIFT-Review, which is a workbench for systematic review based on NLP [38]; SparkText, which is a big data framework for mining biomedical literature [39]; and GIS, which is an NLP-based framework for gene discovery from scientific literature [40]. In addition to these tools, several frameworks for mining biomedical literature have been developed [41][42][43][44][45][46][47].…”
Section: Exploring Voluminous Information (mentioning)
confidence: 99%
“…23 Ye et al used support vector machines to predict cancer type in full-text articles. 24 ES is defined as how much you should think of a disorder given a finding or manifestation. This embodies a sense of how important it is not to miss this disorder.…”
Section: Background and Significance (mentioning)
confidence: 99%