2017
DOI: 10.1200/jco.2017.35.8_suppl.232

A natural language processing algorithm to measure quality prostate cancer care.

Abstract: 232 Background: Electronic health records (EHRs) are a widely adopted but underutilized source of data for systematic assessment of healthcare quality. Barriers for use of this data source include its vast complexity, lack of structure, and the lack of use of standardized vocabulary and terminology by clinicians. This project aims to develop generalizable algorithms to extract useful knowledge regarding prostate cancer quality metrics from EHRs. Methods: We used EHR ICD-9/10 codes to identify prostate cancer …

Cited by 5 publications
(7 citation statements)
“…We have demonstrated the feasibility of our data-mining workflow to extract accurate, clinically meaningful information from EHR [25][26][27][28]. The key for data extraction is to transform patient encounters into a retrospective longitudinal record for each patient and identify cohorts of interest, known as clinical phenotyping [17], using structured and unstructured data. The custom extractors we develop range in complexity based on the types of data and analytic methods required to identify and pull each variable at high fidelity.…”
Section: Data Mining (citation type: mentioning)
confidence: 99%
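The step this statement describes, turning encounters into a per-patient longitudinal record and then selecting a cohort, might be sketched as follows. This is a minimal illustration and not the cited authors' workflow: the `Encounter` type, both function names, and the code set (ICD-9 185 and ICD-10 C61 for malignant neoplasm of the prostate) are assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Encounter:
    """Hypothetical minimal encounter row; real EHR schemas are far richer."""
    patient_id: str
    visit_date: date
    icd_codes: tuple
    note_text: str = ""


# Assumed code set for illustration: ICD-9 185 and ICD-10 C61.
PROSTATE_CANCER_CODES = frozenset({"185", "C61"})


def build_longitudinal_records(encounters):
    """Group encounters by patient and order each patient's visits by date."""
    records = defaultdict(list)
    for enc in encounters:
        records[enc.patient_id].append(enc)
    for visits in records.values():
        visits.sort(key=lambda e: e.visit_date)
    return dict(records)


def select_cohort(records, codes=PROSTATE_CANCER_CODES):
    """Return IDs of patients with at least one encounter carrying a target code."""
    return {
        pid
        for pid, visits in records.items()
        if any(code in codes for visit in visits for code in visit.icd_codes)
    }
```

Structured codes alone would only cover part of the phenotyping the statement mentions; the unstructured `note_text` field is left unused here.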
“…Frame Elements References
CODE TERMINOLOGY: clinical terminology used for the procedure mention [52]
CODE VALUE: value of the terminology code [52], [1]
INSTITUTION: institution where the procedure was performed [52]
NEGATION: whether the mention is negated [52], [77]
MENTION: words related to any procedure term (e.g. flex sig, guaiac card) [75], [52], [15], [84], [77], [43]
MARGIN: usually the rim of normal tissue taken removed during or after procedure (surgical margin) [52]
ANATOMICAL SITE: part of body procedure targets (e.g., breast) [79], [52]
TEMPORAL INFORMATION: time and date descriptors (e.g., "colonoscopy in 2005", "flexible sigmoidoscopy 5 years ago"), date of completion [52], [15], [1], [10]
STATUS: procedure or treatment status (e.g., refused, declined, scheduled, planned, completed, reported vs not reported) [15], [42], [1], [82]
MODIFIER: negation and other modifiers that change the status of procedure (e.g., "no", "never") [15]
Frame: TUMOR DESCRIPTION
ANATOMICAL SITE: anatomic locations (e.g., "segment 5" or "left lobe") with attributes (Liver, Non Liver), target location (e.g., liver and segment #7) as well as non-target location (e.g., breast) [13], [8], [52], [85], [42], [25], [9], [23], [26], [41]
LATERALITY: side of a paired organ associated with origin of the primary tumor [25]
TYPE: primary/metastatic [25]
STATUS: benign or malignancy status along with diagno...…”
Section: Frame (citation type: mentioning)
confidence: 99%
“…[79]
TEST RESULT: result of the test (e.g. positive vs negative) [14], [82], [94], [83], [77]
Frame: THERAPEUTIC PROCEDURE
TYPE: treatment type (e.g., RFA and TACE) [42], [60], [50]
LINE OF THERAPY: initial treatment is referred to as first-line treatment or first-line therapy, however, a secondline treatment may be suggested later [50]
THERAPY DOSE: the total about of treatment (e.g., radiation) the patient is exposed to (e.g. radiation therapy dose) [10]
TOXICITIES: various toxicities related to cancer treatment therapy along with negation and certainty [10]
Frame: CANCER FINDING
PROCEDURE: name of the associated cancer procedure (e.g., Breast Core biopsy), can be a separate frame (CANCER PROCEDURE) [7], [79], [71], [81], [11]
FINDING TYPE: type of the finding (e.g., negative, normal, positive, possible, probably, history, mild, stable, improved, or recommendation) [80]
FINDING MODIFIER: modifying words from within the report (e.g., type: mild, modifiers: tortuous) [80]
LATERALITY: location or sidedness of the finding (e.g., "left", "right," "both", or "bilateral") [79]
BODY PARTS: body organs on which the finding is reported [11]
Frame: PATHOLOGY FINDING
POSITIVE LYMPH NODES NUMBER: number of positive lymph nodes mentioned in the finding [71], [48], [65]
POSITIVE LYMPH NODES STATUS: presence or absence of positive lymph nodes [64]
LYMPH NODES REMOVED: number of lymph nodes removed [48]
NUCLEAR GRADE: describes how closely the nuclei of cancer cells look like the nuclei of normal cells [71]
PLOIDY: refers to amount of DNA the cancer cells contain [71]
QUALITATIVE S-PHASE: indicator of tumor growth rate [71]
BIOMARKER: name of the biomarker [12], [40]
BIOMARKER TEST RESULTS / MUTATION STATUS: positive or negative for biomarkers...…”
Section: Frame (citation type: mentioning)
confidence: 99%
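The frame and frame-element layout quoted in the two statements above could be held in a small data structure along these lines. This is only a sketch: the class names `Frame` and `FrameElement`, the `negated` flag, and the example values are hypothetical and do not come from the cited survey.

```python
from dataclasses import dataclass, field


@dataclass
class FrameElement:
    """One slot of a frame, e.g. LATERALITY or LYMPH NODES REMOVED."""
    name: str
    value: str
    negated: bool = False  # assumed flag, loosely mirroring the NEGATION element


@dataclass
class Frame:
    """A named frame (e.g. PATHOLOGY FINDING) holding its extracted elements."""
    name: str
    elements: list = field(default_factory=list)

    def values_of(self, element_name):
        """Return every non-negated value recorded for the given element name."""
        return [
            e.value
            for e in self.elements
            if e.name == element_name and not e.negated
        ]


# Illustrative values only; not taken from any report.
pathology = Frame(
    "PATHOLOGY FINDING",
    [
        FrameElement("POSITIVE LYMPH NODES NUMBER", "2"),
        FrameElement("LYMPH NODES REMOVED", "12"),
    ],
)
```

Keeping elements as a list rather than a dictionary lets one frame carry repeated elements, such as several anatomical sites mentioned in the same report.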
“…Since biomedical domains exhibit high degree of terminological variation, NLP’s precision becomes even more valuable based on its ability to automatically recognise all variants of domain language [15]. There are a few studies [16–20] using NLP techniques to extract concepts related to prostate cancer from EHRs for decision support, but the accuracy of DRE extraction was reported only in two studies [16, 21]. The features related to DRE reporting was also missing in those studies.…”
Section: Introduction (citation type: mentioning)
confidence: 99%