2009
DOI: 10.1186/1471-2105-10-233

A realistic assessment of methods for extracting gene/protein interactions from free text

Abstract: Background: The automated extraction of gene and/or protein interactions from the literature is one of the most important targets of biomedical text mining research. In this paper we present a realistic evaluation of gene/protein interaction mining relevant to potential non-specialist users. Hence we have specifically avoided methods that are complex to install or require reimplementation, and we coupled our chosen extraction methods with a state-of-the-art biomedical named entity tagger.

Cited by 53 publications (49 citation statements)
References 23 publications (35 reference statements)
“…In RelEx, the lemmatization of 175 words is presented, and all the words are captured except for "ligand" and "use" when lemmatizing the word in our ontology, where "ligand" as a noun without corresponding verb and "use" as stop words are ignored. The words used in Temkin (Temkin and Gilder, 2003) and Kabiljo (Kabiljo et al, 2009) are all covered by our ontology. As the discussed hierarchy in WordNet, the meaning of the lower-level word in the relation ontology is more general than that of the corresponding higher-level one, i.e., the lower the level where the …”
Section: Coverage Evaluation (mentioning)
confidence: 99%
“…We compare our relation ontology with the protein interaction relation words that are extracted from corpora BioInfer (Pyysalo et al, 2007), BioCreAtIvE-PPI, LLL05, Hakenberg (Hakenberg et al, 2006), RelEx (Fundel et al, 2007), Temkin (Temkin and Gilder, 2003), and Kabiljo (Kabiljo et al, 2009), in which the singular and plural of verb and noun are ignored. As in Table 4, the columns Extracted relation words, Ignored, and Recall represent the total number of extracted relation words, the number of omitted words by our method, and the recall of our ontology that is computed in Formula 3, respectively.…”
Section: Coverage Evaluation (mentioning)
confidence: 99%
“…Such sentences could be those containing biological-specific names such as drug, gene and/or protein names, or biological processes such as protein-to-protein interaction and DNA evolution. The idea of identifying specific words or entity in texts is still an active research area [19,20,21] because of the diversity in biomedical text as discussed in Section 1.1. To our best knowledge, there has not been much work that has addressed boosting sentences and words as part of a ranking strategy.…”
Section: Sentence and Term-based Boosting (mentioning)
confidence: 99%
“…We also conduct relation extraction on general named entities, such as "smoking" or "sleep quality". Kabiljo et al (2009) compared pattern-matching techniques against a baseline regular expression approach for gene/protein entity extraction. But existing tools for relation extraction are not as comprehensive as entity recognition tools.…”
Section: Related Work (mentioning)
confidence: 99%