2018
DOI: 10.1148/radiol.2018171093

Natural Language–based Machine Learning Models for the Annotation of Clinical Radiology Reports

Abstract: Purpose: To compare different methods for generating features from radiology reports and to develop a method to automatically identify findings in these reports. Materials and Methods: In this study, 96 303 head computed tomography (CT) reports were obtained. The linguistic complexity of these reports was compared with that of alternative corpora. Head CT reports were preprocessed, and machine-analyzable features were constructed by using bag-of-words (BOW), word embedding, and Latent Dirichlet allocation-based …
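To make the bag-of-words (BOW) step named in the abstract concrete, here is a minimal sketch of how a report can be turned into a count vector. It is not the study's pipeline: the tokenizer, the helper names (`tokenize`, `build_vocabulary`, `bow_vector`), and the two example sentences are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(report):
    """Lowercase a report and split it into alphabetic tokens (assumed tokenizer)."""
    return re.findall(r"[a-z]+", report.lower())


def build_vocabulary(reports):
    """Map each distinct token to a column index, in sorted order."""
    tokens = sorted({tok for r in reports for tok in tokenize(r)})
    return {tok: i for i, tok in enumerate(tokens)}


def bow_vector(report, vocabulary):
    """Count how often each vocabulary token occurs in one report."""
    counts = Counter(tokenize(report))
    # Dict iteration follows insertion order, which is the sorted token order.
    return [counts.get(tok, 0) for tok in vocabulary]


# Hypothetical example sentences, not taken from the study's corpus.
reports = [
    "No acute intracranial hemorrhage.",
    "Acute hemorrhage in the left frontal lobe.",
]
vocab = build_vocabulary(reports)
vectors = [bow_vector(r, vocab) for r in reports]
```

Each report becomes a fixed-length vector with one column per vocabulary token. Word order is discarded, which is the defining simplification of a BOW representation.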

Cited by 123 publications (108 citation statements)
References 31 publications
“…NLP-generated Radiograph Annotation and Labeling: Annotation of the radiographs was automated by developing an NLP system that was able to process and map the language used in each radiology report (16). The architecture of our NLP system was somewhat similar to the architecture described by Cornegruta et al (17) and Pesce et al (18).…”
Section: Data Set; citation type: mentioning
confidence: 99%
“…Researchers have attempted to use data mining and natural language processing of the electronic health record (EHR) and the picture archiving and communication system (PACS) for extracting clinical data and diagnosis from the physicians' and pathology reports. The accuracy of the retrieved labels depends on the methods used. It has been shown that automatically mined disease labels or annotations can contain substantial noise.…”
Section: Deep Learning Approach To Cad; citation type: mentioning
confidence: 99%
“…where $p_{ij}$ is the similarity of two points in a high-dimensional space. t-SNE has been widely used in image processing, natural language processing, genomic data analysis, and speech processing [24][25][26].…”
Section: Visualization; citation type: mentioning
confidence: 99%
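The equation that the last statement refers to is not shown in the excerpt. For orientation only, the high-dimensional similarity in the standard t-SNE formulation of van der Maaten and Hinton is usually written as below. The symbols $x_i$ (data points), $\sigma_i$ (per-point Gaussian bandwidth), and $N$ (number of points) are assumptions brought in from that formulation, and the citing paper may define $p_{ij}$ differently.

```latex
p_{j|i} = \frac{\exp\left(-\lVert x_i - x_j \rVert^2 / 2\sigma_i^2\right)}
               {\sum_{k \neq i} \exp\left(-\lVert x_i - x_k \rVert^2 / 2\sigma_i^2\right)},
\qquad
p_{ij} = \frac{p_{j|i} + p_{i|j}}{2N}
```

The symmetrization in the second expression ensures that $p_{ij} = p_{ji}$ and that the similarities sum to 1 over all pairs.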