2021 · Preprint
DOI: 10.21203/rs.3.rs-402058/v1

MIMIC-IF: Interpretability and Fairness Evaluation of Deep Learning Models on MIMIC-IV Dataset

Abstract: The recent release of large-scale healthcare datasets has greatly propelled research on data-driven deep learning models for healthcare applications. However, due to the black-box nature of such deep models, concerns about interpretability, fairness, and bias in healthcare scenarios, where human lives are at stake, call for a careful and thorough examination of both datasets and models. In this work, we focus on MIMIC-IV (Medical Information Mart for Intensive Care, version IV), the largest publicly avai…

Cited by 15 publications (7 citation statements) · References 43 publications (36 reference statements)
“…An uncertainty measure is needed that is based on the predictions and noise distribution but also integrates the uncertainty propagation of the DL model prediction [78,79] (like the inverse of the Fisher information matrix used in the CRLB definition [74]). Despite a flourishing literature [80,81] addressing uncertainty estimation as a complementary tool for DL interpretability, a full-scale analysis of the robustness and reliability of such models is still challenging [82–84]. First attempts to extend these concepts in DL for MRS quantification are only the subject of recent investigations [76,77] and are far from general acceptance.…”
Section: Discussion
confidence: 99%
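For context on the CRLB referenced in this quote: the Cramér–Rao lower bound caps how precise any unbiased estimator can be via the inverse Fisher information matrix. A minimal statement in generic notation (the symbols below are standard placeholders, not taken from the cited works):

```latex
% Cramer-Rao lower bound: for an unbiased estimator \hat{\theta} of \theta,
% the covariance is bounded below by the inverse Fisher information matrix.
\operatorname{Cov}\!\left(\hat{\theta}\right) \succeq I(\theta)^{-1},
\qquad
I(\theta)_{ij} = -\,\mathbb{E}\!\left[
  \frac{\partial^{2} \log p(x;\theta)}{\partial \theta_i \, \partial \theta_j}
\right]
```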
“…On a different note, EHR systems store multi-modal, heterogeneous patient data, such as demographics, diagnoses, and clinical records, and have been used for various tasks such as medical concept extraction, mortality prediction, and disease inference. Regarding EHR data fairness, Meng et al. [73] identify race-level differences in the predictions of neural network models on the MIMIC-IV dataset [57], with Black and Hispanic patients being less likely to receive interventions or receiving interventions of shorter average duration. Similarly, Röösli et al. [85] reveal a strong class imbalance problem and significant fairness concerns for Black and publicly insured ICU patients in the same dataset.…”
Section: Related Work
confidence: 99%
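To illustrate the kind of group-level disparity check described in this quote, here is a minimal Python sketch; the dataframe columns (race, pred_intervention, intervention_hours) are hypothetical placeholders, not the actual MIMIC-IV schema or the authors' code.

```python
import pandas as pd

# Hypothetical columns, not the real MIMIC-IV schema:
# "race" (demographic group), "pred_intervention" (model's binary
# prediction), "intervention_hours" (predicted intervention duration).
def group_disparities(df: pd.DataFrame) -> pd.DataFrame:
    """Per-group predicted intervention rate and mean duration,
    plus each group's gap relative to the overall population."""
    overall_rate = df["pred_intervention"].mean()
    overall_dur = df["intervention_hours"].mean()
    per_group = df.groupby("race").agg(
        intervention_rate=("pred_intervention", "mean"),
        mean_duration=("intervention_hours", "mean"),
        n=("pred_intervention", "size"),
    )
    per_group["rate_gap"] = per_group["intervention_rate"] - overall_rate
    per_group["duration_gap"] = per_group["mean_duration"] - overall_dur
    return per_group

# Usage with toy data:
# df = pd.DataFrame({"race": [...], "pred_intervention": [...],
#                    "intervention_hours": [...]})
# print(group_disparities(df))
```

A large negative rate_gap or duration_gap for one group relative to the others is the pattern the quoted study reports for Black and Hispanic patients.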
“…First, neural network-based approaches have been found to discriminate against certain demographic groups. For example, one related study tests for and finds race-level differences in the predictions of multiple neural network models on MIMIC-IV [68]. It finds that the Black and Hispanic groups are less likely to receive intervention, along with a shorter average duration of intervention.…”
Section: Structured Electronic Health Records Data
confidence: 99%
“…Sniffing out fairness issues can tell when to trust or distrust AI [7,32,8,31], which can further be combined with the domain knowledge of human experts to complement the model [98]. XAI can be used as a tool to expose biases in machine learning models by revealing the relationship between model explainability and prediction fairness [68]. The authors use several XAI methods to analyze the feature importance of the trained models.…”
Section: Decision Explanation
confidence: 99%
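To make the feature-importance analysis described in this quote concrete, here is a minimal, hypothetical sketch using scikit-learn's permutation importance on synthetic data; the model, dataset, and all parameters are placeholder assumptions, not the pipeline from the cited work.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; the quoted works use models trained on MIMIC-IV.
X, y = make_classification(n_samples=1000, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Permutation importance: the drop in held-out score when each feature is
# shuffled, a model-agnostic proxy for how much the model relies on it.
result = permutation_importance(model, X_test, y_test,
                                n_repeats=10, random_state=0)
for i in np.argsort(result.importances_mean)[::-1]:
    print(f"feature {i}: {result.importances_mean[i]:.3f}")
```

In a fairness audit of the kind the quote describes, one would compare such importance scores across models or demographic subgroups to see whether sensitive attributes, or their proxies, drive the predictions.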