2017
DOI: 10.1016/j.jbi.2017.04.015

Towards generalizable entity-centric clinical coreference resolution

Abstract: Objective: This work investigates the problem of clinical coreference resolution in a model that explicitly tracks entities, and aims to measure the performance of that model in both traditional in-domain train/test splits and cross-domain experiments that measure the generalizability of learned models. Methods: The two methods we compare are a baseline mention-pair coreference system that operates over pairs of mentions with best-first conflict resolution and a mention-synchronous system that incrementally bu…

Citations: cited by 10 publications (4 citation statements)
References: 27 publications
“…On the other hand, out of 16 papers involving publicly available corpora, 12 exploit the Informatics for Integrating Biology and the Bedside (i2b2) datasets. The other 4 public datasets used are MIMIC-II [107], PhenoCHF [116], Temporal Histories of Your Medical Event (THYME), and Cancer Deep Phenotype Extraction (DeepPhe) [76].…”
Section: Natural Language Processing Tasks Methods and Datasets (mentioning)
confidence: 99%
“…Carrell et al [66] proposed an NLP system to process clinical text to identify breast cancer recurrences, while Castro et al [22] addressed the automated Breast Imaging-Reporting and Data System (BI-RADS) categories extraction from breast radiology reports. Miller et al [76] proposed a tool for coreference resolution in clinical texts evaluated within the domain (colon cancer) and between domains (breast cancer). Mykowiecka et al [77] propose a rule-based IE system evaluated on mammography reports.…”
Section: Breast Cancer (mentioning)
confidence: 99%
“…With histories involving dozens of relevant notes (one dataset used in DeepPhe averaged 30 notes/patient, after clearly irrelevant notes were removed), manual expert review will not be sufficient for the large-scale analyses needed to drive innovation. Although advances in cross-document coreference [15] and other techniques currently being explored by the DeepPhe project show great promise in increasing the utility of clinical text, NLP is only a first step, providing an intermediate representation not directly consumable by end-users.…”
Section: Discussion (mentioning)
confidence: 99%
“…In addition, as with other AI algorithms, developing supervised NLP machine learning models almost always requires large gold annotated data sets from which the algorithm can learn. There are few publicly available corpora of gold annotated clinical text [11,63,93,94,118-128] and no publicly available corpora of radiation oncology clinical documentation. Although some labeled data in these public corpora are translatable to radiation oncology, such as disease site, signs and symptoms, and comorbidities, the highly unique terminology and jargon used frequently in radiation oncology but rarely in other medical domains, such as "gross/clinical/planning tumor volume," "boost," "point A/B," and "Gy," may require large radiation oncology-specific gold annotated corpora.…”
Section: Outstanding Challenges Of NLP In Radiation Oncology (mentioning)
confidence: 99%