2016
DOI: 10.1016/j.jbi.2015.12.010

An unsupervised learning method to identify reference intervals from a clinical database

Abstract: Reference intervals are critical for the interpretation of laboratory results. The development of reference intervals using traditional methods is time consuming and costly. An alternative approach, known as an a posteriori method, requires an expert to enumerate diagnoses and procedures that can affect the measurement of interest. We develop a method, LIMIT, to use laboratory test results from a clinical database to identify ICD9 codes that are associated with extreme laboratory results, thus automating the a…

Cited by 45 publications (25 citation statements)
References: 39 publications
“…Some studies have used repeat testing as an assumption of illness and, therefore, an exclusion criterion (11,12), whereas Kouri et al (1) used discharge diagnoses related to the analyte of interest to exclude patients. Poole et al (13) recently used unsupervised computer learning to identify ICD9 codes associated with extreme laboratory values and then used these codes to exclude individuals from the reference population. Here, we demonstrate how data mining coupled with disease-specific exclusion criteria was used in the a posteriori sampling technique to select a large reference population of euthyroid patients from the laboratory database to establish age-based thyroid-stimulating hormone (TSH) reference intervals.…”
mentioning
confidence: 99%
“…This may simply involve excluding values beyond an arbitrary limit, such as those more than 10 times the upper reference limit 13 or involve a statistical test, such as that of Tukey, or others. [14][15][16] Another data pre-processing step that may be used is the exclusion of data from particular referral sites where there is a high likelihood that the patients have significant disease, such as intensive care units and oncology departments. 13,17 It may also be appropriate to exclude data from additional referral sites depending on the analyte of interest, for example lipid and renal clinics.…”
Section: Data Pre-processing
mentioning
confidence: 99%
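The two pre-processing rules named in the statement above (a fixed multiple of the upper reference limit, and Tukey's test) can be sketched as follows. This is a minimal illustration, not the cited authors' code: the function names, the conventional Tukey multiplier `k=1.5`, the "inclusive" quartile method, and the sample sodium-like values are all assumptions made here for demonstration.

```python
from statistics import quantiles


def exclude_beyond_multiple(values, upper_reference_limit, multiple=10):
    """Drop values above `multiple` times the upper reference limit.

    The default of 10 mirrors the example in the quoted statement;
    the function name and signature are hypothetical.
    """
    cutoff = multiple * upper_reference_limit
    return [v for v in values if v <= cutoff]


def tukey_fences(values, k=1.5):
    """Return (lower, upper) Tukey fences: Q1 - k*IQR and Q3 + k*IQR.

    k=1.5 is the conventional choice and an assumption here; the
    quartile interpolation method is also a choice, not a given.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def exclude_by_tukey(values, k=1.5):
    """Keep only values that fall inside the Tukey fences."""
    lower, upper = tukey_fences(values, k)
    return [v for v in values if lower <= v <= upper]


if __name__ == "__main__":
    # Made-up values for illustration only, not data from any study.
    sample = [136, 138, 139, 140, 140, 141, 142, 143, 144, 190]
    print(tukey_fences(sample))
    print(exclude_by_tukey(sample))
```

Tukey's rule adapts to the spread of the extracted data, whereas the fixed-multiple rule needs an existing reference limit to start from, which is why the two are often treated as complementary pre-processing steps.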
“…This was done by the Laboratory Mining for Individualized Threshold (LIMIT) study, which used an unsupervised machine learning algorithm to identify diagnostic codes that were significantly associated with outlier results for the analyte of interest. 14 The 'learning' component of the algorithm involved setting values for 4 parameters (one of which, for instance, governed the sensitivity to outlier detection). These values were set using data for serum sodium because of its well-established reference interval.…”
Section: Subjects With Disease Excluded From the Extracted Data
mentioning
confidence: 99%
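The general idea in the statement above, finding diagnostic codes that are significantly associated with outlier results, can be illustrated with a simple per-code enrichment test. The sketch below is not the LIMIT algorithm and does not reproduce its four parameters. It swaps in a one-sided Fisher exact test as a stand-in, and the record layout, function names, code labels, and `alpha` threshold are all hypothetical.

```python
from math import comb


def fisher_upper_tail(a, b, c, d):
    """One-sided Fisher exact p-value for a 2x2 table.

    a: has code, outlier       b: has code, not outlier
    c: no code, outlier        d: no code, not outlier
    Returns P(X >= a) under the hypergeometric null of no association.
    """
    with_code = a + b
    outliers = a + c
    n = a + b + c + d
    upper = min(with_code, outliers)
    favourable = sum(
        comb(with_code, k) * comb(n - with_code, outliers - k)
        for k in range(a, upper + 1)
    )
    return favourable / comb(n, outliers)


def codes_associated_with_outliers(records, alpha=0.05):
    """Return codes enriched among outlier results.

    `records` is a list of (set_of_codes, is_outlier) pairs, a
    hypothetical layout. No multiple-testing correction is applied,
    which a real analysis would need.
    """
    total = len(records)
    total_outliers = sum(1 for _, is_outlier in records if is_outlier)
    all_codes = set().union(*(codes for codes, _ in records))
    flagged = []
    for code in sorted(all_codes):
        a = sum(1 for codes, out in records if code in codes and out)
        b = sum(1 for codes, out in records if code in codes and not out)
        c = total_outliers - a
        d = total - total_outliers - b
        if fisher_upper_tail(a, b, c, d) < alpha:
            flagged.append(code)
    return flagged


if __name__ == "__main__":
    # Made-up records: CODE_A appears only alongside outlier results.
    demo = [({"CODE_A"}, True)] * 5 + [({"CODE_B"}, False)] * 15
    print(codes_associated_with_outliers(demo))
```

Patients carrying a flagged code would then be candidates for exclusion from the reference population, which is the role the quoted statements attribute to the codes LIMIT identifies.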
“…In this case, the very notion of reference limits would change, and ML, by leveraging and improving other statistical approaches, could help limit the misinterpretation of values outside of reference limits or of apparently normal data but also diagnostic for some conditions (e.g. [25]). Furthermore, some envision an ML-based clinical decision support that, by predicting correlated test results and enhancing the diagnostic value of multianalyte sets of test results, could help to reduce redundant laboratory testing [26] and, hence, lower healthcare costs, which are estimated to total $5 billion yearly in the United States alone [27].…”
Section: Review Of Machine Learning In Medicine
mentioning
confidence: 99%