2020
DOI: 10.1093/jamia/ocaa164
Reporting of demographic data and representativeness in machine learning models using electronic health records

Abstract: Objective: The development of machine learning (ML) algorithms to address a variety of issues faced in clinical practice has increased rapidly. However, questions have arisen regarding biases in their development that can affect their applicability in specific populations. We sought to evaluate whether studies developing ML models from electronic health record (EHR) data report sufficient demographic data on the study populations to demonstrate representativeness and reproducibility. …
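As a concrete illustration of what "demonstrating representativeness" can involve in practice, the sketch below compares a study cohort's demographic mix against a target population using a chi-square goodness-of-fit test. This is a minimal sketch of one possible check, not a method from the paper; all counts, proportions, and group labels are hypothetical.

```python
# Minimal sketch (assumptions ours, not the paper's) of a representativeness
# check that adequate demographic reporting makes possible: comparing a study
# cohort's demographic mix against a target population.
from scipy.stats import chisquare

cohort_counts = {"White": 720, "Black": 150, "Hispanic": 80, "Asian": 50}   # hypothetical cohort
target_props = {"White": 0.60, "Black": 0.18, "Hispanic": 0.14, "Asian": 0.08}  # hypothetical target

groups = list(cohort_counts)
observed = [cohort_counts[g] for g in groups]
n = sum(observed)
# Counts we would expect if the cohort mirrored the target population.
expected = [target_props[g] * n for g in groups]

stat, p = chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square = {stat:.1f}, p = {p:.3g}")
for g in groups:
    print(f"{g:>8}: cohort {cohort_counts[g] / n:.1%} vs target {target_props[g]:.1%}")
```

A small p-value here would flag a cohort whose demographic mix departs from the intended target population; the point of the paper is that without reported demographics, readers cannot run even this basic comparison.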

Cited by 35 publications (35 citation statements)
References 31 publications
“…Past research in other medical fields has revealed that machine learning models parameterized and trained on patient populations whose characteristics differ from those of the target population can produce biased predictions [13][14][15][16][17]. We sought to assess whether this was the case for our KF models as well.…”
Section: Journal Pre-proof
confidence: 99%
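The failure mode this statement describes can be demonstrated with a small synthetic sketch (ours, not the cited works'): a logistic regression developed on a cohort where one subgroup is rare learns the majority subgroup's feature-outcome relationship and discriminates poorly for the underrepresented subgroup once deployed on a target population where that subgroup is common. All features, prevalences, and effect sizes below are invented for illustration.

```python
# Synthetic illustration of training/target population mismatch.
# Assumption: the predictive signal sits in a different feature for
# subgroup B than for subgroup A, so a model fit mostly on A misfires on B.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def simulate(n, frac_b):
    """Hypothetical cohort: outcome depends on feature 0 in subgroup A
    and on feature 1 in subgroup B."""
    b = rng.random(n) < frac_b                      # True -> subgroup B
    x = rng.normal(0, 1, (n, 2))
    logit = np.where(b, 2.0 * x[:, 1], 2.0 * x[:, 0])
    y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)
    return x, y, b

# Development cohort skewed toward subgroup A (B is 5% of patients).
X_tr, y_tr, _ = simulate(20_000, frac_b=0.05)
skewed = LogisticRegression().fit(X_tr, y_tr)

# Representative development cohort for comparison (B is 40%).
X_rep, y_rep, _ = simulate(20_000, frac_b=0.40)
representative = LogisticRegression().fit(X_rep, y_rep)

# Evaluate both on a target population where B is 40% of patients.
X_te, y_te, b = simulate(20_000, frac_b=0.40)
for name, model in [("skewed-trained", skewed), ("representative-trained", representative)]:
    s = model.predict_proba(X_te)[:, 1]
    print(f"{name}: AUROC A = {roc_auc_score(y_te[~b], s[~b]):.2f}, "
          f"AUROC B = {roc_auc_score(y_te[b], s[b]):.2f}")
```

On this synthetic data the skew-trained model discriminates near chance for subgroup B while the representative-trained model performs substantially better, which is the bias pattern the cited studies report.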
“…Inconsistencies in how ML models developed from electronic health records are reported have also been noted, with details regarding race and ethnicity of participants omitted in 64% of studies, and only 12% of models externally validated. 11 To address these concerns, adapted research reporting guidelines based on the well-established EQUATOR Network (Enhancing the QUAlity and Transparency Of health Research) 12 13 and de novo recommendations by individual societies have been published, with particular relevance to AI research. In this review, we highlight those that will cover the majority of healthcare-focused AI-related studies, and explain how they differ from the well-known guidance for non-AI clinical work.…”
Section: Introduction
confidence: 99%
“…One review examining 164 models described in the scientific literature found low reporting rates of demographic variables such as race (36%) and socioeconomic status (8%) as well as low external validation rates (12%). 43 A critical review of published models for diagnosis and prognosis of COVID-19 found that most models were at high risk of bias due to poor reporting. 44 The purpose of this analysis is to assess whether the documentation available for commonly deployed models provides the information requested by model reporting guidelines.…”
Section: Introduction
confidence: 99%
“…The purpose of this analysis is to assess whether the documentation available for commonly deployed models provides the information requested by model reporting guidelines. Compared to previous work, 43,44 we focus on user-facing product documentation accompanying models. Thus, we are able to analyze models that have been deployed in practice but not yet described in peer-reviewed publications.…”
Section: Introduction
confidence: 99%