2019
DOI: 10.1200/cci.19.00008

Obtaining Knowledge in Pathology Reports Through a Natural Language Processing Approach With Classification, Named-Entity Recognition, and Relation-Extraction Heuristics

Abstract: PURPOSE Robust institutional tumor banks depend on continuous sample curation or else subsequent biopsy or resection specimens are overlooked after initial enrollment. Curation automation is hindered by semistructured free-text clinical pathology notes, which complicate data abstraction. Our motivation is to develop a natural language processing method that dynamically identifies existing pathology specimen elements necessary for locating specimens for future use in a manner that can be re-implemented by other…

Cited by 20 publications (14 citation statements)
References 16 publications
“…Noteworthy developments include information extraction pipelines which utilize regular expressions (regex), to highlight key report findings (eg. extraction of molecular test results) 20-23, as well as topic modeling approaches that summarize a document corpus by common themes and wording 24. In addition to extraction methods, machine learning techniques have been applied to classify pathologist reports 25; notable examples include prediction of ICD-O morphological diagnostic codes 26,27 and prediction of CPT codes based only on diagnostic text 28,29.…”
Section: Background and Significance
Citation type: mentioning
confidence: 99%
“…Our method is suitable for dealing with overall organs, as opposed to merely the target organ. Oliwa et al developed an ML-based model using named-entity recognition to extract specimen attributes 26 . Our model could extract not only specimen keywords but procedure and pathology ones as well.…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…The analysis of pathology reports using NLP has been particularly impactful in recent years, particularly in the areas of information extraction, summarization, and categorization. Noteworthy developments include information extraction pipelines that utilize regular expressions (regex), to highlight key report findings (e.g., extraction of molecular test results),[20-23] as well as topic modeling approaches that summarize a document corpus by common themes and wording.[24] In addition to extraction methods, machine-learning techniques have been applied to classify pathologist reports[25]; notable examples include the prediction of ICD-O morphological diagnostic codes[26,27] and the prediction of CPT codes based only on diagnostic text.…”
Section: Background and Significance
Citation type: mentioning
confidence: 99%