2004
DOI: 10.1093/nar/gkh162

Automatic extraction of mutations from Medline and cross-validation with OMIM

Abstract: Mutations help us to understand the molecular origins of diseases. Researchers, therefore, both publish and seek disease-relevant mutations in public databases and in scientific literature, e.g. Medline. The retrieval tends to be time-consuming and incomplete. Automated screening of the literature is more efficient. We developed extraction methods (called MEMA) that scan Medline abstracts for mutations. MEMA identified 24,351 singleton mutations in conjunction with a HUGO gene name out of 16,728 abstracts. Fro…

Cited by 87 publications (80 citation statements)
References 14 publications
“…The capture of the reference sequence data from literature before or after publication might be helpful. Recently, several natural language tools have been developed to identify protein names and mutation terms from Medline abstract or full text articles, such as MutationFinder (Caporaso et al, 2007), MEMA (Rebholz-Schuhmann et al, 2004), mSTRAP (Kanagasabai et al, 2007) and MuteXt (Horn et al, 2004). Verifying the extracted mutation information based on corresponding position within the protein sequence will improve their performance.…”
Section: Discussion
confidence: 99%
“…A 2004 study developed regular expressions to extract mutations from MEDLINE abstracts [Rebholz-Schuhmann et al, 2004]. Two subsequent studies used pipelines to extract mutations automatically from full-length publications [Baker and Witte, 2006;Lee et al, 2007].…”
Section: Assessment Of Text Mining Approaches
confidence: 99%
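
To make the regular-expression approach in the statement above concrete, here is a minimal sketch in Python. It is not the expression set used by MEMA or any other cited tool; the pattern, the constant names, and the function `find_point_mutations` are assumptions made for illustration. It only recognises simple substitutions written as wild-type residue, position, mutant residue, in one-letter (`R175H`) or three-letter (`Gly12Val`) form.

```python
import re

# One-letter codes of the 20 standard amino acids.
AA1 = "ACDEFGHIKLMNPQRSTVWY"
# Three-letter codes of the same residues, as a regex alternation.
AA3 = ("Ala|Arg|Asn|Asp|Cys|Gln|Glu|Gly|His|Ile|Leu|Lys|Met|"
       "Phe|Pro|Ser|Thr|Trp|Tyr|Val")

# Hypothetical pattern (an assumption, not taken from any cited tool):
# wild-type residue, 1-based position, mutant residue.
POINT_MUTATION = re.compile(
    rf"\b(?:(?P<wt1>[{AA1}])(?P<pos1>[1-9]\d*)(?P<mt1>[{AA1}])"
    rf"|(?P<wt3>{AA3})(?P<pos3>[1-9]\d*)(?P<mt3>{AA3}))\b"
)


def find_point_mutations(text):
    """Return (wild_type, position, mutant) tuples found in text."""
    results = []
    for m in POINT_MUTATION.finditer(text):
        # Exactly one of the two alternatives matched; pick its groups.
        wt = m.group("wt1") or m.group("wt3")
        pos = m.group("pos1") or m.group("pos3")
        mt = m.group("mt1") or m.group("mt3")
        results.append((wt, int(pos), mt))
    return results


if __name__ == "__main__":
    print(find_point_mutations("The R175H and Gly12Val substitutions were found."))
```

A pattern this permissive will also match tokens that merely look like mutations (cell-line or histone names, for example), which is one reason such hits are usually checked against a gene name or a reference sequence.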
“…The advantage of this technique is that it does not require the protein name to be in the same sentence as the mutation. The 2004 study that extracted mutations within MEDLINE abstracts also deciphered between protein-mutation pairs using syntactical and proximity parameters [Rebholz-Schuhmann et al, 2004]. Within our mutation-finding tool in the present study, the occurrences of the protein words are highlighted, and the most commonly occurring protein is selected.…”
Section: Assessment Of Text Mining Approaches
confidence: 99%
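
The proximity idea mentioned in the statement above can be sketched as follows. This is an illustrative assumption, not the syntactical and proximity rules of the 2004 study: each mutation mention is simply paired with the gene mention that is closest in character distance. The function `pair_by_proximity`, the simplified `MUTATION` pattern, and the gene list passed in are all hypothetical.

```python
import re

# Simplified one-letter substitution pattern (an assumption for illustration).
MUTATION = re.compile(
    r"\b[ACDEFGHIKLMNPQRSTVWY][1-9]\d*[ACDEFGHIKLMNPQRSTVWY]\b"
)


def pair_by_proximity(text, gene_names):
    """Pair each mutation mention with the nearest gene mention in text."""
    # Collect (start, end, name) for every occurrence of every gene name.
    gene_hits = []
    for name in gene_names:
        for m in re.finditer(rf"\b{re.escape(name)}\b", text):
            gene_hits.append((m.start(), m.end(), name))

    pairs = []
    for mut in MUTATION.finditer(text):
        if not gene_hits:
            pairs.append((None, mut.group()))
            continue

        def gap(hit):
            # Number of characters between the gene mention and the mutation.
            start, end, _ = hit
            if end <= mut.start():
                return mut.start() - end
            if start >= mut.end():
                return start - mut.end()
            return 0

        nearest = min(gene_hits, key=gap)
        pairs.append((nearest[2], mut.group()))
    return pairs


if __name__ == "__main__":
    sentence = "TP53 carried R175H, whereas the G12V change was seen in KRAS."
    print(pair_by_proximity(sentence, ["TP53", "KRAS"]))
```

Character distance is the crudest possible proximity measure; a real system would more plausibly work on tokens or sentences and combine distance with syntactic cues.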
“…[6,8-12] MuteXt [8], MEMA [9], and Mutation GraB [10] attempt to extract mentions of mutations paired with a specific gene or gene product from input texts. OSIRIS [11] is a web-based information retrieval system for compiling the mutation literature using a concept-driven, mutation-recognition approach.…”
Section: Point Mutation Recognition
confidence: 99%
“…Like the earlier mutation recognition systems [8-10], MutationFinder applies a set of regular expressions to identify mutation mentions in input texts. Our currently top-performing collection of regular expressions results in a precision of 98.4% and a recall of 81.9% when extracting mutation mentions from completely blind test data.…”
Section: MutationFinder
confidence: 99%
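
For readers unfamiliar with the two figures quoted above, the following sketch shows how precision and recall are conventionally computed from a set of extracted mentions and a gold-standard set. The function name `precision_recall` and the toy data are assumptions; they do not reproduce the evaluation reported in the statement.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for extracted items against a gold standard."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Precision: fraction of extracted items that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of gold-standard items that were extracted.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    gold = {"R175H", "G12V", "V600E", "Q61K"}
    predicted = {"R175H", "G12V", "K27M"}
    p, r = precision_recall(predicted, gold)
    print(f"precision={p:.3f} recall={r:.3f}")
```

In the toy data, two of the three extracted mentions are correct and two of the four gold mentions are found, so precision is 2/3 and recall is 1/2.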