2012
DOI: 10.1371/journal.pone.0038460

SR4GN: A Species Recognition Software Tool for Gene Normalization

Abstract: As suggested in recent studies, species recognition and disambiguation is one of the most critical and challenging steps in many downstream text-mining applications such as the gene normalization task and protein-protein interaction extraction. We report SR4GN: an open source tool for species recognition and disambiguation in biomedical text. In addition to the species detection function in existing tools, SR4GN is optimized for the Gene Normalization task. As such it is developed to link detected species with…

citations
Cited by 75 publications
(64 citation statements)
references
References 19 publications
“…Next, gene name normalization assigns a unique gene identifier for ambiguous gene mentions in text. This step is performed with GenNorm (Wei et al, 2012), which achieved state-of-the-art performance in the BioCreative III challenge (Arighi et al, 2011). Finally, relations or events are extracted between these genes and proteins, detecting a variety of different event types ranging from phosphorylation and ubiquitination to PPIs and regulatory associations.…”
Section: Text Mining Methodology (mentioning)
confidence: 99%
“…It is based on conditional random fields and identifies many types of mutations and sequence variants in protein, gene, DNA and RNA levels for biomedical curation. This tool was developed in Perl and uses the CRF++ module developed in C++. SR4GN (16) is a species recognition tool optimized for the gene normalization task. It is a rule-based system that identifies species from full-texts and pairs them with corresponding gene or protein mentions.…”
Section: Concept Recognition Tools (mentioning)
confidence: 99%
“…Similar methods have been applied to disease names (Doğan & Lu, 2012b; Kang et al, 2012; Névéol et al, 2009) and species names (Gerner et al, 2010; Wei et al, 2012b), and the MetaMap program is used to locate and identify concepts from the UMLS MetaThesaurus (Aronson, 2001; Bodenreider, 2004).…”
Section: Introduction (mentioning)
confidence: 99%