The Shapley value has become popular in the Explainable AI (XAI) literature, thanks, to a large extent, to a solid theoretical foundation, including four "favourable and fair" axioms for attribution in transferable utility games. The Shapley value is provably the only solution concept satisfying these axioms. In this paper, we introduce the Shapley value and draw attention to its recent uses as a feature selection tool. We call into question this use of the Shapley value, using simple, abstract "toy" counterexamples to illustrate that the axioms may work against the goals of feature selection. From this, we develop a number of insights that are then investigated in concrete simulation settings, with a variety of Shapley value formulations, including SHapley Additive exPlanations (SHAP) and Shapley Additive Global importancE (SAGE). Our aim is not to encourage any use of the Shapley value for feature selection, but to clarify various limitations around its current use in the literature. In so doing, we hope to help demystify certain aspects of the Shapley value axioms that are viewed as "favourable". In particular, we wish to highlight that the favourability of the axioms depends non-trivially on the way in which the Shapley value is appropriated in the XAI application.
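To make the tension between the axioms and feature selection concrete, the following is a minimal sketch of an exact Shapley value computation on a toy transferable utility game. The game itself (features `a`, `a_copy`, and `b`, and the payoffs 1.0 and 0.25) is a hypothetical illustration and is not taken from the paper; the function names are likewise assumptions made for this example.

```python
from itertools import permutations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values: average each player's marginal
    contribution over every ordering of the players."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: t / n_orders for p, t in totals.items()}


def toy_value(coalition):
    """Hypothetical game: 'a' and 'a_copy' are redundant (either one
    is worth 1.0); 'b' independently adds 0.25."""
    signal = 1.0 if ("a" in coalition or "a_copy" in coalition) else 0.0
    extra = 0.25 if "b" in coalition else 0.0
    return signal + extra


features = ["a", "a_copy", "b"]
phi = shapley_values(features, toy_value)

# Naive selection rule: keep the two features with the largest Shapley value.
top_two = sorted(features, key=phi.get, reverse=True)[:2]

print(phi)
print(top_two, toy_value(frozenset(top_two)))
print(["a", "b"], toy_value(frozenset({"a", "b"})))
```

In this sketch the two redundant features split the credit for the shared signal equally, and the attributions sum to the value of the full feature set. Ranking by Shapley value then selects the redundant pair, whose joint value is lower than that of `a` together with `b`.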
The International Severe Acute Respiratory and Emerging Infection Consortium (ISARIC) COVID-19 dataset is one of the largest international databases of prospectively collected clinical data on people hospitalized with COVID-19. This dataset was compiled during the COVID-19 pandemic by a network of hospitals that collect data using the ISARIC-World Health Organization Clinical Characterization Protocol and data tools. The database includes data from more than 705,000 patients, collected in more than 60 countries and 1,500 centres worldwide. Patient data are available from acute hospital admissions with COVID-19 and outpatient follow-ups. The data include signs and symptoms, pre-existing comorbidities, vital signs, chronic and acute treatments, complications, dates of hospitalization and discharge, mortality, viral strains, vaccination status, and other data. Here, we present the dataset characteristics, explain its architecture and how to gain access, and provide tools to facilitate its use.
The analysis of spatial observations on a sphere is important in areas such as geosciences, physics and embryo research, to name just a few. The purpose of the package rcosmo is to conduct efficient information processing, visualisation, manipulation and spatial statistical analysis of Cosmic Microwave Background (CMB) radiation and other spherical data. The package was developed for spherical data stored in the Hierarchical Equal Area isoLatitude Pixelation (HEALPix) representation. rcosmo has more than 100 different functions. Most of them were initially developed for CMB data, but they can also be used for other spherical data, as rcosmo contains tools for transforming spherical data in Cartesian and geographic coordinates into the HEALPix representation. We give a general description of the package and illustrate some important functionalities and benchmarks.
Background: Acute kidney injury (AKI) is one of the most common and significant problems in patients with Coronavirus Disease 2019 (COVID-19). However, little is known about the incidence and impact of AKI occurring in the community or early in the hospital admission. The traditional Kidney Disease Improving Global Outcomes (KDIGO) definition can fail to identify patients for whom hospitalisation coincides with recovery of AKI as manifested by a decrease in serum creatinine (sCr). We hypothesised that an extended KDIGO (eKDIGO) definition, adapted from the International Society of Nephrology (ISN) 0by25 studies, would identify more cases of AKI in patients with COVID-19 and that these may correspond to community-acquired AKI (CA-AKI) with poor outcomes similar to those previously reported in this population. Methods and findings: All individuals recruited using the International Severe Acute Respiratory and Emerging Infection Consortium (ISARIC)–World Health Organization (WHO) Clinical Characterisation Protocol (CCP) and admitted to 1,609 hospitals in 54 countries with Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) infection from February 15, 2020 to February 1, 2021 were included in the study. Data were collected and analysed for the duration of a patient's admission. Incidence, staging, and timing of AKI were evaluated using a traditional and eKDIGO definition, which incorporated a commensurate decrease in sCr. Patients within eKDIGO diagnosed with AKI by a decrease in sCr were labelled as deKDIGO. Clinical characteristics and outcomes (intensive care unit [ICU] admission, invasive mechanical ventilation, and in-hospital death) were compared for all 3 groups of patients. The relationship between eKDIGO AKI and in-hospital death was assessed using survival curves and logistic regression, adjusting for disease severity and AKI susceptibility. A total of 75,670 patients were included in the final analysis cohort. Median length of admission was 12 days (interquartile range [IQR] 7, 20). There were twice as many patients with AKI identified by eKDIGO as by KDIGO (31.7% versus 16.8%). Those in the eKDIGO group had a greater proportion of stage 1 AKI (58% versus 36% in KDIGO patients). Peak AKI occurred early in the admission more frequently among eKDIGO than KDIGO patients. Compared to those without AKI, patients in the eKDIGO group had worse renal function on admission, more in-hospital complications, higher rates of ICU admission (54% versus 23%) and invasive ventilation (45% versus 15%), and increased mortality (38% versus 19%). Patients in the eKDIGO group had a higher risk of in-hospital death than those without AKI (adjusted odds ratio: 1.78, 95% confidence interval: 1.71 to 1.80, p-value < 0.001). Mortality and rate of ICU admission were lower among deKDIGO than KDIGO patients (25% versus 50% death and 35% versus 70% ICU admission) but significantly higher when compared to patients with no AKI (25% versus 19% death and 35% versus 23% ICU admission) (all p-values < 5 × 10⁻⁵). Limitations include ad hoc sCr sampling, exclusion of patients with fewer than 2 sCr measurements, and limited availability of sCr measurements prior to initiation of acute dialysis. Conclusions: An extended KDIGO definition of AKI resulted in a significantly higher detection rate in this population. These additional cases of AKI occurred early in the hospital admission and were associated with worse outcomes compared to patients without AKI.