2016
DOI: 10.1371/journal.pone.0152725

DiMeX: A Text Mining System for Mutation-Disease Association Extraction

Abstract: The number of published articles describing associations between mutations and diseases is increasing at a fast pace. There is a pressing need to gather such mutation-disease associations into public knowledge bases, but manual curation slows down the growth of such databases. We have addressed this problem by developing a text-mining system (DiMeX) to extract mutation to disease associations from publication abstracts. DiMeX consists of a series of natural language processing modules that preprocess input tex…

Cited by 52 publications (40 citation statements)
References 42 publications (51 reference statements)
“…Several efforts addressing information extraction and text classification tasks benefit from an integrated approach. For instance, Botsis et al, developed a common decision support environment for medical product safety surveillance [73], while others used NLP to build databases of clinically useful information, such as mutation-disease associations [74], drug side-effects by parsing product labels [75], and genetic alteration information in cancer trials [76]. This includes work on annotated text corpora, e.g.…”
Section: Text Classification and Information Extraction Remain Strong
Citation type: mentioning
confidence: 99%
“…Therefore, diagnostic systems (Chen et al, 2018) have become more relevant and researchers such as Xia et al attempt to take on the challenge through the mining of information from sources such as DO, Symptom Ontology (SYMP) and MEDLINE/PubMed citation records (Xia et al, 2018). We can also observe in the literature a large volume of studies that use the mining of texts from different unstructured or semi-structured medical information sources (Frunza, Inkpen & Tran, 2011;Mazumder et al, 2016;Singhal, Simmons & Lu, 2016;Xu et al, 2016;Tsumoto et al, 2017;Sudeshna, Bhanumathi & Hamlin, 2017;Aich et al, 2017;Gupta et al, 2018;Rao & Rao, 2018;Zhao et al, 2018;Bou Rjeily et al, 2019).…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…The large amount of data generated from these studies [4] necessitates the need to develop an automatic approach in order to facilitate the study of the extracted associations. Recently, a few corpora and methods have been developed with the aim of extracting mutation and disease associations from texts such as [5] and [6]. There is, on the other hand, no available corpus for extracting the association of SNP-phenotypes from texts annotated with negation, modality, and the confidence degree of such associations.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…PKDE4J [5] and Dimex [6] are two methods for extracting mutation and disease association, the latter being a rule-based unsupervised mutation-disease association extraction working on the abstract level. The PKDE4J, however, is a supervised method that employs a rich set of rules to detect the used features.…”
Section: Introduction
Citation type: mentioning
confidence: 99%