2022
DOI: 10.1001/jamanetworkopen.2022.27779
Assessment of Adherence to Reporting Guidelines by Commonly Used Clinical Prediction Models From a Single Vendor

Abstract: IMPORTANCE Various model reporting guidelines have been proposed to ensure clinical prediction models are reliable and fair. However, no consensus exists about which model details are essential to report, and commonalities and differences among reporting guidelines have not been characterized. Furthermore, how well documentation of deployed models adheres to these guidelines has not been studied. OBJECTIVES To assess information requested by model reporting guidelines and whether the documentation for commonly …

Citations: cited by 28 publications (23 citation statements)
References: 72 publications
“…Data extraction adhered to the CHARMS (Checklist for Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modeling Studies) and PRISMA guidelines. 19, 544 We used a closed-loop cross-sequential design for quality control on data extraction and screening (5 assessors). Details on data extraction are provided in eMethods 2 and 3 in Supplement 1.…”
Section: Methods (citation type: mentioning)
confidence: 99%
“…In addition, poor reporting quality impedes the establishment of better practices in AI-based psychiatric diagnosis. Despite the development of multifarious benchmarks for reporting clinical prediction models (eg, STARD [Standards for Reporting of Diagnostic Accuracy], DECIDE-AI [Early-Stage Clinical Evaluation of Decision Support Systems Driven by Artificial Intelligence], and TRIPOD [Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis]), 16, 17, 18, 19 the reporting quality in current publications still needs to be improved, with approximately 80% of articles providing incomplete information and less than 50% of contents following the TRIPOD guideline, limiting clinical potential and contributing to research waste. 20, 21, 22 Even after 3 decades of neuroimaging-based AI model developments, it remains unclear whether the reports of such developments are unbiased and complete enough to warrant translation into clinical practice for psychiatric diagnosis.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…While the aforementioned bias detection programs have merit, solving the problem of surgical bias will require a more comprehensive approach. That approach begins with a set of guidelines that set forth standards on how to conduct AI-related research and how to report it in the professional literature, including the Standard Protocol Items: Recommendations for Interventional Trials-Artificial Intelligence (SPIRIT-AI) extension, a set of guidelines designed to help researchers develop AI-related clinical trials, 33 and the Consolidated Standards of Reporting Trials-Artificial Intelligence (CONSORT-AI) extension. 34 Unfortunately, despite the recommendations from thought leaders regarding the importance of adhering to standards that would make algorithms more equitable, Lu et al have found these guidelines are often ignored. 35 …”
Section: Devising a Comprehensive Bias Detection Toolkit (citation type: mentioning)
confidence: 99%
“…Given widespread use of the DTI to assist in care delivery, it is important to measure its performance across as many health care settings as possible. Notably, the optimal DTI score at which to trigger clinical interventions remains unclear, and the vendor does not report performance across subgroups of race, ethnicity, age, or sex. This study aimed to (1) describe the overall and lead time performance of the DTI among a large cohort of patients across 8 heterogeneous Midwestern US hospitals, (2) evaluate performance measures at suggested thresholds to support clinical decision-making, and (3) assess bias in predictions among demographic subgroups.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
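
The subgroup bias assessment described in this citation statement can be illustrated with a short sketch. The snippet below is not the cited study's code: the DTI is proprietary, and every name here (the cohort DataFrame, the dti_score, deteriorated, and race columns, and the THRESHOLD value) is a hypothetical assumption. It shows one common way to evaluate a risk score's sensitivity and positive predictive value at a fixed trigger threshold, stratified by a demographic subgroup.

import pandas as pd

THRESHOLD = 60  # hypothetical trigger score; the optimal DTI cutoff is unknown

def subgroup_metrics(cohort: pd.DataFrame, group_col: str) -> pd.DataFrame:
    """Sensitivity and PPV at THRESHOLD, stratified by one demographic column.

    Assumes hypothetical columns: 'dti_score' (numeric risk score) and
    'deteriorated' (binary outcome). All names are placeholders.
    """
    rows = []
    for group, g in cohort.groupby(group_col):
        pred = g["dti_score"] >= THRESHOLD          # alert fires at/above threshold
        outcome = g["deteriorated"].astype(bool)    # observed deterioration
        tp = int((pred & outcome).sum())            # true positives
        fp = int((pred & ~outcome).sum())           # false positives
        fn = int((~pred & outcome).sum())           # false negatives
        rows.append({
            group_col: group,
            "n": len(g),
            "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
            "ppv": tp / (tp + fp) if (tp + fp) else float("nan"),
        })
    return pd.DataFrame(rows)

# Hypothetical usage: print(subgroup_metrics(cohort, "race"))

Comparing sensitivity and PPV across the rows of the resulting table is one simple check for the kind of subgroup performance gaps the study set out to assess; differences across groups at the same threshold would suggest the alert burdens or protects groups unevenly.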