2019
DOI: 10.1504/ijcmh.2019.104365
Extraction of breast cancer biomarker status using natural language processing

Abstract: We employed natural language processing (NLP) algorithms to extract estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor 2 (HER2) receptor status for females with breast cancer from unstructured (free-text) EMR data, and to determine the prevalence of triple-negative breast cancer in the Indiana Network for Patient Care (INPC) population. We identified female patients in INPC with a history of breast cancer over a ten-year period who had at least five oncology notes or one rel…

Cited by 1 publication (7 citation statements)
References 6 publications
“…This study was designed to evaluate the possibility of automatically extracting the status of the 3 main breast cancer biomarkers (ER, PR, and HER2) from the contents of pathology reports written in two different languages, and coming from 82 different providers, using conventional machine learning models. After testing different classifiers, the best performing ones achieved macro-averaged F1 scores ranging from 0.89 to 0.92 on the held-out test sets, which is on par with best efforts in the literature (6,7,11,12). The reported F1 scores in the literature range between 0.87 and 1, but use only three possible labels for HER2, whereas five are used in the present work.…”
Section: Discussion (supporting)
confidence: 58%
“…Within the context of these activities, having the breast cancer receptor status at its disposal would undoubtedly be of added value. In breast cancer, estrogen receptor (ER), progesterone receptor (PR), and Erb-b2 receptor tyrosine kinase 2 (ERBB2, previously named Human Epidermal Growth Factor 2 or HER2 or HER-2/neu) are biomarkers known to be related to tumor growth and prognosis, and assessing their expression is necessary to define therapeutic management (3-7). Currently this information is not available in a structured form at the BCR.…”
Section: Introduction (mentioning)
confidence: 99%