2017
DOI: 10.1186/s12859-017-1775-9

Coreference annotation and resolution in the Colorado Richly Annotated Full Text (CRAFT) corpus of biomedical journal articles

Abstract: Background: Coreference resolution is the task of finding strings in text that have the same referent as other strings. Failures of coreference resolution are a common cause of false negatives in information extraction from the scientific literature. In order to better understand the nature of the phenomenon of coreference in biomedical publications and to increase performance on the task, we annotated the Colorado Richly Annotated Full Text (CRAFT) corpus with coreference relations. Results: The corpus was manuall…

Cited by 48 publications (45 citation statements)
References 59 publications (40 reference statements)
“…We employed two biomedical corpora: BioNLP Protein Coreference dataset (Nguyen et al, 2011) and CRAFT (Cohen et al, 2017). The BioNLP dataset consists of 1,210 PubMed abstracts selected from the GENIA-MedCo coreference corpus.…”
Section: Data (mentioning)
confidence: 99%
“…The BioNLP dataset consists of 1,210 PubMed abstracts selected from the GENIA-MedCo coreference corpus. CRAFT (Cohen et al, 2017) … coreference annotations of 67 full papers extracted from PMC. While BioNLP focusses on protein/gene coreference, CRAFT covers a wider range of coreference relations such as events, pronominal anaphora, noun phrases, verbs, and nominal premodifiers coreference.…”
Section: Data (mentioning)
confidence: 99%
“…Coreference resolution is important not only in general domains but also in the biomedical domain. The Colorado Richly Annotated Full Text (CRAFT) corpus (Cohen et al, 2017) was constructed with an aim of boosting the performance of the task in the biomedical literature. Unlike other corpora, CRAFT is comprised of full text articles or full papers, its coreferent chains are arbitrarily long; the mean length of coreferent chains is 4 while the longest chain is 186, which makes the resolution even more difficult than usual.…”
Section: Introduction (mentioning)
confidence: 99%
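
The chain statistics quoted in the last statement (mean and longest coreferent chain length) can be illustrated with a minimal sketch. It assumes a chain is represented as a plain list of mention strings that share a referent; the toy chains and the helper name `chain_length_stats` are hypothetical and are not taken from CRAFT or its tooling.

```python
from statistics import mean

# Hypothetical toy data: each inner list is one coreference chain,
# i.e. a group of mentions assumed to share a single referent.
chains = [
    ["the Brca1 gene", "it", "this gene", "Brca1"],
    ["mutant mice", "they"],
    ["the assay", "it", "the procedure"],
]


def chain_length_stats(chains):
    """Return (mean length, longest length) over a list of coreference chains."""
    lengths = [len(chain) for chain in chains]
    return mean(lengths), max(lengths)


avg_len, max_len = chain_length_stats(chains)
print(f"mean chain length: {avg_len}, longest chain: {max_len}")
```

On a full-text corpus the same computation would run over chains that can span an entire article, which is how a mean of 4 can coexist with a longest chain of 186.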