2016
DOI: 10.1093/database/baw068

BioCreative V CDR task corpus: a resource for chemical disease relation extraction

Abstract: Community-run, formal evaluations and manually annotated text corpora are critically important for advancing biomedical text-mining research. Recently in BioCreative V, a new challenge was organized for the tasks of disease named entity recognition (DNER) and chemical-induced disease (CID) relation extraction. Given the nature of both tasks, a test collection is required to contain both disease/chemical annotations and relation annotations in the same set of articles. Despite previous efforts in biomedical cor…

Cited by 526 publications (407 citation statements)
References 28 publications

“…Finally, CTD continues its commitment to spearhead and advance biomedical text-mining research for the scientific community by teaming with the National Center for Biotechnology Information (NCBI) to organize a BioCreative community challenge which focused on developing tools to identify and extract specific disease and chemical content (11). For this endeavor, we helped develop a large corpus of manually curated annotations for chemicals, diseases and their interactions from 1500 PubMed articles (12); this corpus is freely available (download by clicking: http://sourceforge.net/projects/bioc/files/CDR_Data.zip/download), as are many of the associated text-mining tools developed by the 25 teams that participated.…”
Section: New Features (mentioning)
confidence: 99%
“…These data are further associated with external data sets to establish novel, statistically ranked inferences between diverse types of information (6,7). Additionally, as part of our continued, active engagement with the scientific community, CTD plays a significant role in advancing text-mining methods for biomedical information as part of the BioCreative consortium (8–12), facilitates the development of semantic standards for the environmental health science community (13), complies with reporting standards set by the BioSharing Information Resources (14) and is a registered member (https://biosharing.org/biodbcore-000173) of BioDBcore (15). …”
Section: Introduction (mentioning)
confidence: 99%
“…As shown in Table 3, the proposed model has achieved the best performance with the highest F-score 58.8%, which proved that our model was better than a traditional machine learning or rule-based method. 29 UET-CAM firstly resolved co-references by using a multi-pass sieve to identify cross-sentence references for entities and then extracted drug-disease relations by using SVM, which achieved better performance. We only dealt with the second subtask.…”
Section: Results Analysis (mentioning)
confidence: 99%
“…Table 4 shows the performance of different methods on the CID task. 29 UET-CAM firstly resolved co-references by using a multi-pass sieve to identify cross-sentence references for entities and then extracted drug-disease relations by using SVM, which achieved better performance. (UTH-CCB) combined two SVM-based classifiers, which were trained on sentence-level and document-level.…”
Section: 4 (mentioning)
confidence: 99%
“…Spice/herb names in abstracts were tagged using a dictionary matching method. For recognizing and normalizing disease names in the text, we used TaggerOne (Leaman and Lu, 2016) which has a reported precision of 85% and recall of 80% on the Biocreative V Chemical Disease Relation test set (Li et al, 2016). We used the pre-trained disease-only model available with TaggerOne (Leaman and Lu, 2016) on our data.…”
Section: Methods (mentioning)
confidence: 99%