2008
DOI: 10.1186/gb-2008-9-s2-s2

Overview of BioCreative II gene mention recognition

Abstract: Nineteen teams presented results for the Gene Mention Task at the BioCreative II Workshop. In this task participants designed systems to identify substrings in sentences corresponding to gene name mentions. A variety of different methods were used and the results varied with a highest achieved F1 score of 0.8721. Here we present brief descriptions of all the methods used and a statistical analysis of the results. We also demonstrate that, by combining the results from all submissions, an F score of 0.9066 is f…
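
For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how an F1 score can be computed for span identification. It assumes exact matching of predicted spans against gold spans, which is a simplification; the task's actual scoring rules are not described here. The function name `span_f1` and the `(sentence_id, start, end)` span format are hypothetical.

```python
def span_f1(gold, predicted):
    """Precision, recall, and F1 over sets of (sentence_id, start, end) spans.

    Assumption: a predicted span counts as correct only if it matches
    a gold span exactly.
    """
    gold, predicted = set(gold), set(predicted)
    true_pos = len(gold & predicted)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


gold = [("s1", 0, 4), ("s1", 10, 15), ("s2", 3, 8)]
predicted = [("s1", 0, 4), ("s2", 3, 9)]  # second span has a wrong boundary
precision, recall, f1 = span_f1(gold, predicted)
```

Here one of two predictions is correct and one of three gold spans is found, so precision is 1/2 and recall is 1/3.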

citations
Cited by 356 publications
(316 citation statements)
references
References 16 publications
“…Conditional random fields (CRFs) [15] achieve state-of-the-art performance across a broad spectrum of sequence labeling tasks, including the gene mention task in the BioCreativeII competition [13]. Linear-chain CRFs are discriminative probabilistic models over observation sequences x and label sequences y = (y_1, ..., y_{|y|}), where |x| = |y|, and each label y_i has K different possible discrete values (multi-class).…”
Section: B Conditional Random Field (mentioning)
confidence: 99%
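
As an illustration of the sequence-labeling setup described in this statement, the sketch below decodes the best label sequence for a linear-chain model with Viterbi search. It is a generic sketch, not the cited authors' implementation: the score tables `emission` and `transition` are hypothetical stand-ins for the weighted feature sums a trained CRF would produce.

```python
def viterbi(emission, transition):
    """Return the highest-scoring label sequence for a linear-chain model.

    emission[i][k]:   score of label k at position i (hypothetical input)
    transition[j][k]: score of moving from label j to label k (hypothetical input)
    """
    n, num_labels = len(emission), len(emission[0])
    score = list(emission[0])  # best score of any path ending in each label
    back = []                  # back-pointers, one list per position after the first
    for i in range(1, n):
        new_score, pointers = [], []
        for k in range(num_labels):
            best_j = max(range(num_labels),
                         key=lambda j: score[j] + transition[j][k])
            pointers.append(best_j)
            new_score.append(score[best_j] + transition[best_j][k] + emission[i][k])
        score = new_score
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(range(num_labels), key=lambda k: score[k])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Two labels (0 and 1); switching labels is penalized by the transition scores.
emission = [[2, 0], [0, 1], [0, 3]]
transition = [[0, -5], [-5, 0]]
best_path = viterbi(emission, transition)
```

With these toy scores, picking the best label independently at each position would start with label 0, but the switching penalty makes the all-ones path the best sequence overall, which is the kind of joint decision a chain model captures.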
“…We then test SLF using conditional random fields (using CRF++ tool) as a base classifier on the gene-name-recognition (GM) data set (the number of tokens in training/testing/unlabeled sets in Table III) from the 2006 BioCreativeII BioNLP competition [13]. For this GM task, we utilized the following word features: (i) words in a 5-word-window surrounding current, (ii) capitalization features of current and surrounding words, (iii) prefix and suffix (up to length 4) of current and surrounding words, (iv) string patterns of current and surrounding words.…”
Section: Experiments a Datasets And Settings (mentioning)
confidence: 99%
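
The four feature groups listed in this statement can be sketched roughly as follows. This is an illustrative approximation, not the cited authors' feature templates: it assumes the 5-word window means two tokens on each side of the current token, and it interprets "string patterns" as a simple word-shape encoding. The names `word_shape` and `token_features` are hypothetical.

```python
import re


def word_shape(word):
    """Map a token to a coarse pattern, e.g. 'BRCA1' -> 'AAAA0' (assumed encoding)."""
    shape = re.sub(r"[A-Z]", "A", word)
    shape = re.sub(r"[a-z]", "a", shape)
    return re.sub(r"[0-9]", "0", shape)


def token_features(tokens, i, window=2, max_affix=4):
    """Build a feature dict for tokens[i] from it and its neighbours."""
    feats = {}
    for offset in range(-window, window + 1):
        j = i + offset
        if not 0 <= j < len(tokens):
            feats[f"w[{offset}]"] = "<PAD>"  # position falls outside the sentence
            continue
        word = tokens[j]
        feats[f"w[{offset}]"] = word.lower()            # (i) word in the window
        feats[f"cap[{offset}]"] = word[:1].isupper()    # (ii) capitalization
        for n in range(1, min(max_affix, len(word)) + 1):
            feats[f"pre{n}[{offset}]"] = word[:n]       # (iii) prefixes up to length 4
            feats[f"suf{n}[{offset}]"] = word[-n:]      # (iii) suffixes up to length 4
        feats[f"shape[{offset}]"] = word_shape(word)    # (iv) string pattern
    return feats


features = token_features(["The", "BRCA1", "gene"], 1)
```

A dictionary like this would be produced once per token and then converted to whatever input format the chosen sequence labeler expects.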
“…These linguistic issues are often handled using rules. But, except very few attempts (Califf and Mooney, 1999; Smith et al., 2008), such rules are manually elaborated and texts which can be processed are necessarily specific and limited. Furthermore, machine learning (ML) based methods such as support vector machines, conditional random fields, etc. (Smith et al., 2008) need many features and their outcomes are not really understandable by a user.…”
Section: Introduction (mentioning)
confidence: 99%