2019
DOI: 10.7554/elife.44941

Linking glycemic dysregulation in diabetes to symptoms, comorbidities, and genetics through EHR data mining

Abstract: Diabetes is a diverse and complex disease, with considerable variation in phenotypic manifestation and severity. This variation hampers the study of etiological differences and reduces the statistical power of analyses of associations to genetics, treatment outcomes, and complications. We address these issues through deep, fine-grained phenotypic stratification of a diabetes cohort. Text mining the electronic health records of 14,017 patients, we matched two controlled vocabularies (ICD-10 and a custom vocabul…

Citations: cited by 12 publications (12 citation statements)
References: 64 publications (74 reference statements)
“…First, our population of diabetic patients, made of Italians and undocumented migrants, appeared to be divided into five clusters on the basis of their clinical, anamnestic and demographic features, but not of ethnicity (passive variable). The fact that patients with T2D follow into different clusters is not surprising and has been demonstrated by other authors [4,25]. Complications of T2D follow a similar pattern and show differences among different clusters [26], and in turn the analysis of risk factors for T2D also generates different clusters with different prevalence and clinical features of T2D [27].…”
Section: Discussion (mentioning)
confidence: 58%
“…Jensen et al (2012) also described the issue of unstructured text data in EHRs, pointing out that improvements in text mining techniques were making these parts of the EHR more accessible for data mining [18]. Kirk et al (2019) implemented a text-mining approach to match patient records with two clinical vocabularies to perform data clustering on 14,017 patients [19]. In addition to clinical features, administrative factors such as length of stay in the hospital have been examined with the aid of EHR data as well [20].…”
Section: Discussion (mentioning)
confidence: 99%
“…Four papers applied complex ML methods to a set of less than ten clinical variables such as systolic blood pressure, waist circumference, BMI, fasting plasma glucose, and age at diabetes diagnosis, and resulting subgroups were associated with outcomes such as mortality. Eleven studies used a larger set of more than ten clinical features as inputs for classification, including data from electronic health records [28,29], and identified subgroups with associated clinical outcomes, including risk of cardiovascular disease. Two other studies specifically employed cardiovascular traits including ECG [30] and echocardiographic [31] for ML algorithm inputs, and each identified subgroups with different associations with risk of cardiovascular disease.…”
Section: Description Of the Categorised Subgroups (mentioning)
confidence: 99%