2021
DOI: 10.1186/s40779-021-00338-z

Data mining in clinical big data: the frequently used databases, steps, and methodological models

Abstract: Many high-quality studies have emerged from public databases, such as Surveillance, Epidemiology, and End Results (SEER), National Health and Nutrition Examination Survey (NHANES), The Cancer Genome Atlas (TCGA), and Medical Information Mart for Intensive Care (MIMIC); however, these data are often characterized by a high degree of dimensional heterogeneity, timeliness, scarcity, irregularity, and other characteristics, resulting in the value of these data not being fully utilized. Data-mining technology has b…

citations
Cited by 298 publications
(275 citation statements)
references
References 95 publications
“…It contains comprehensive information on more than 250,000 electronic admission records of the Beth Israel Deaconess Medical Center in Boston, Massachusetts from 2008 to 2019. These records include the diagnosis, vital signs, laboratory tests, medication, and surgical information [18, 19]. The data used in this study were from the latest version of MIMIC-IV 1.0, which was released in March 2021.…”
Section: Methods; citation type: mentioning
confidence: 99%
“…Continuous variables were expressed as median and interquartile range, while categorical variables were expressed as percentages. The predictive variables were selected using the stepwise selection method in the Cox regression model in the training cohort [19], and the 3-, 5-, and 8-year OS rates for patients with ALM were constructed on the basis of the identified variables.…”
Section: Methods; citation type: mentioning
confidence: 99%
“…MIMIC is a large, single-center, open-access database. MIMIC-III includes data on more than 58,000 admissions to Beth Israel Deaconess Medical Center in Boston from 2001 to 2012, comprising 38,645 adults and 7,875 newborns (14-16), and MIMIC-IV covers 524,740 admissions for 382,278 patients at this center from 2008 to 2019 (17, 18). The relevant records include demographic data, hourly vital signs, laboratory test results, microbial culture results, imaging data, treatment procedures, medication records, and survival information.…”
Section: Methods; citation type: mentioning
confidence: 99%
“…LCEV patient data were obtained from Medical Information Mart for Intensive Care III and IV (MIMIC-III v1.4 (17, 18). The relevant records include demographic data, hourly vital signs, laboratory test results, microbial culture results, imaging data, treatment procedures, medication records, and survival information.…”
Section: Data Source; citation type: mentioning
confidence: 99%