A text-mining system for extracting metabolic reactions from full-text articles

Czarnecki, Jan; Nobeli, Irene; Smith, Adrian; Shepherd, Adrian J.

doi:10.1186/1471-2105-13-172

Cited by 32 publications

(22 citation statements)

References 37 publications


“…Lastly, Relation Extraction (RE) is a task for extracting pre-defined facts relating to an entity or entities in the text [29]. In biomedical domain, multiple RE methods have been developed to extract information relating to genes [16], such as Mutation-Disease associations, protein-protein interaction [30,31], pathway curation [32], gene methylation and cancer relation [33], biomolecular events [34], metabolic reactions [35] and gene-gene interactions [36]. For gene regulatory networks, which is the focus of this paper, the RE system must detect and extract a causal relation between a protein and a gene (e.g., A regulated B).…”

Section: Overview and Related Work (mentioning)

confidence: 99%

ModEx: A text mining system for extracting mode of regulation of Transcription Factor-gene regulatory interaction

Farahmand

Riley

Zarringhalam

2019

Preprint


Abstract. Background: Transcription factors (TFs) are proteins that are fundamental to transcription and regulation of gene expression. Each TF may regulate multiple genes and each gene may be regulated by multiple TFs. TFs can act as either activators or repressors of gene expression. This complex network of interactions between TFs and genes underlies many developmental and biological processes and is implicated in several human diseases such as cancer. Hence deciphering the network of TF-gene interactions with information on mode of regulation (activation vs. repression) is an important step toward understanding the regulatory pathways that underlie complex traits. There are many experimental, computational, and manually curated databases of TF-gene interactions. In particular, high-throughput ChIP-Seq datasets provide a large-scale map of transcriptional regulatory interactions. However, these interactions are not annotated with information on context and mode of regulation. Such information is crucial to gain a global picture of gene regulatory mechanisms and can aid in developing machine learning models for applications such as biomarker discovery, prediction of response to therapy, and precision medicine. Methods: In this work, we introduce a text-mining system to annotate ChIP-Seq derived interactions with such metadata through mining PubMed articles. We evaluate the performance of our system using gold standard small scale manually curated databases. Results: Our results show that the method is able to accurately extract mode of regulation with F-score 0.77 on TRRUST curated interactions and F-score 0.96 on the intersection of TRRUST and ChIP-network. We provide an HTTP REST API for our code to facilitate usage. Availability: Source code and datasets are available for download on GitHub: https:


“…Knowledge discovery uses techniques from a wide range of disciplines such as artificial intelligence, machine learning, pattern recognition, data mining, and statistics [45]. Both information extraction and knowledge discovery find their application in database curation [46], [47] and pathway construction [48], [49].…”

Section: E Biomedical Text Mining Tasks (mentioning)

confidence: 99%

A Review of Towered Big-Data Service Model for Biomedical Text-Mining Databases

Abed

Yuan

Li

2017

IJACSA


Abstract: The rapid growth of biomedical informatics has drawn increasing popularity and attention. The reasons behind this are the advances in genomic, new molecular, and biomedical approaches, and various applications such as protein identification, patient medical records, genome sequencing, and medical imaging; a huge set of biomedical research data is being generated day to day. The growing body of biomedical data consists of both structured and unstructured data. In a traditional database system (structured data), managing and extracting useful information from unstructured biomedical data is a tedious job. Hence, mechanisms, tools, processes, and methods need to be applied to unstructured biomedical data (text) to obtain the useful business data. The fast growth of these collections makes it progressively harder for people to reach the required information in a convenient and effective way. Text mining can help us mine information and knowledge from a mountain of text, and is now widely applied in biomedical research. Text mining is not a new technology, but it has recently received spotlight attention due to the emergence of Big Data. The applications of text mining are diverse and span multiple disciplines, ranging from biomedicine to legal, business intelligence and security. In this survey paper, the researcher identifies and discusses biomedical data (text) mining issues, and recommends a possible technique to cope with possible future growth.


“…The NutriChem database [62] has been developed using a similar approach to find plant and diet related compounds from PubMed. Metabolomics text mining was used to extract information on all literature-known compounds in yeast [63], and to complement pathway reconstructions through reports on product/substrate pairs [64]. However, there has been little progress in using automated text mining approaches for the complement of metabolomics data sets, with the sole exception of PolySearch [65].…”

Section: Introduction (mentioning)

confidence: 99%

Integrating bioinformatics approaches for a comprehensive interpretation of metabolomics datasets

Barupal

Fan

Fiehn

2018

Current Opinion in Biotechnology


Access to high quality metabolomics data has become a routine component for biological studies. However, interpreting those datasets in biological contexts remains a challenge, especially because many identified metabolites are not found in biochemical pathway databases. Starting from statistical analyses, a range of new tools are available, including metabolite set enrichment analysis, pathway and network visualization, pathway prediction, biochemical databases and text mining. Integrating these approaches into comprehensive and unbiased interpretations must carefully consider both caveats of the metabolomics dataset itself as well as the structure and properties of the biological study design. Special considerations need to be taken when adopting approaches from genomics for use in metabolomics. R and Python programming language are enabling an easier exchange of diverse tools to deploy integrated workflows. This review summarizes the key ideas and latest developments in regards to these approaches.

