2018
DOI: 10.1186/s13326-017-0173-6

CUILESS2016: a clinical corpus applying compositional normalization of text mentions

Abstract: Background: Traditionally text mention normalization corpora have normalized concepts to single ontology identifiers (“pre-coordinated concepts”). Less frequently, normalization corpora have used concepts with multiple identifiers (“post-coordinated concepts”) but the additional identifiers have been restricted to a defined set of relationships to the core concept. This approach limits the ability of the normalization process to express semantic meaning. We generated a freely available corpus using post-coordina…

Citations: cited by 7 publications (1 citation statement)
References: 14 publications (31 reference statements)
“…To cover temporal reasoning tasks [60], our annotation schema will be expanded to create a subset of this corpus with temporal annotation. The availability of the corpus to the scientific community will allow not only our research group, but other researchers to complement and adapt the SemClinBr annotations according to their needs, without having to start an annotation process from scratch, as Osborne et al (2018) [61] did when normalizing the ShARe corpus, or Wagholikar et al who used pooling techniques to reuse corpora across institutions [62]. Furthermore, the effect of the corpus homogenization process on the performance of these NLP/ML algorithms needs to be determined.…”
Section: Discussion (mentioning)
confidence: 99%