This is a condensed summary of an international multisociety statement on ethics of artificial intelligence (AI) in radiology produced by the ACR, European Society of Radiology, RSNA, Society for Imaging Informatics in Medicine, European Society of Medical Imaging Informatics, Canadian Association of Radiologists, and American Association of Physicists in Medicine. AI has great potential to increase efficiency and accuracy throughout radiology, but it also carries inherent pitfalls and biases. Widespread use of AI-based intelligent and autonomous systems in radiology can increase the risk of systemic errors with high consequence and highlights complex ethical and societal issues. Currently, there is little experience using AI for patient care in diverse clinical settings. Extensive research is needed to understand how to best deploy AI in clinical practice. This statement highlights our consensus that ethical use of AI in radiology should promote well-being, minimize harm, and ensure that the benefits and harms are distributed among stakeholders in a just manner. We believe AI should respect human rights and freedoms, including dignity and privacy. It should be designed for maximum transparency and dependability. Ultimate responsibility and accountability for AI remains with its human designers and operators for the foreseeable future. The radiology community should start now to develop codes of ethics and practice for AI that promote any use that helps patients and the common good and should block use of radiology data and algorithms for financial gain without those two attributes.
The full version (Appendix E1 [online]) is posted on the web pages of each of these societies. Authors include society representatives, patient advocates, an American professor of philosophy, and attorneys with experience in radiology and privacy in the United States and the European Union. Artificial intelligence (AI), defined as computers that behave in ways previously thought to require human intelligence, has the potential to substantially improve radiology, help patients, and decrease costs (1). Radiologists are experts at acquiring information from medical images. AI can extend this expertise, extracting even more information to make better or entirely new predictions about patients. Going forward, conclusions about images will be made by human radiologists in conjunction with intelligent and autonomous machines. Although the machines will make mistakes, they are likely to make decisions more efficiently and more consistently than humans, and in some instances they will contradict human radiologists and be proved correct. AI will affect image interpretation, report generation, result communication, and billing practice (1,2). AI has the potential to alter professional relationships, patient engagement, the knowledge hierarchy, and the labor market. Additionally, AI may exacerbate the concentration and imbalance of resources, with entities that have significant AI resources gaining more "radiology decision-making" capability. Radiologists and radiology departments will themselves become data, categorized and evaluated by AI models. AI will infer patterns in personal, professional, and institutional behavior. The value, ownership, use of, and access to radiology data have taken on new meaning and significance in the era of AI. AI is complex and carries potential pitfalls and inherent biases. Widespread use of AI-based intelligent and autonomous machines in radiology can increase systemic risks of harm, raise the possibility of high-consequence errors, and amplify complex ethical and societal issues.
PURPOSE Accurate risk assessment is essential for the success of breast cancer screening programs. Models with high sensitivity and specificity would enable programs to target more elaborate screening efforts to high-risk populations while minimizing overtreatment for everyone else. Artificial intelligence (AI)-based risk models have demonstrated a significant advance over the risk models used in clinical practice today. However, responsible deployment of novel AI requires careful validation across diverse populations. To this end, we validate our AI-based model, Mirai, across globally diverse screening populations. METHODS We collected screening mammograms and pathology-confirmed breast cancer outcomes from Massachusetts General Hospital, USA; Novant, USA; Emory, USA; Maccabi-Assuta, Israel; Karolinska, Sweden; Chang Gung Memorial Hospital, Taiwan; and Barretos, Brazil. We evaluated Uno's concordance index for Mirai in predicting the risk of breast cancer at one to five years from the mammogram. RESULTS A total of 128,793 mammograms from 62,185 patients were collected across the seven sites, of which 3,815 were followed by a cancer diagnosis within five years. Mirai obtained concordance indices of 0.75 (95% CI, 0.72 to 0.78), 0.75 (95% CI, 0.70 to 0.80), 0.77 (95% CI, 0.75 to 0.79), 0.77 (95% CI, 0.73 to 0.81), 0.81 (95% CI, 0.79 to 0.82), 0.79 (95% CI, 0.76 to 0.83), and 0.84 (95% CI, 0.81 to 0.88) at Massachusetts General Hospital, Novant, Emory, Maccabi-Assuta, Karolinska, Chang Gung Memorial Hospital, and Barretos, respectively. CONCLUSION Mirai, a mammography-based risk model, maintained its accuracy across globally diverse test sets from seven hospitals in five countries. This is the broadest validation to date of an AI-based breast cancer risk model and suggests that the technology can offer broad and equitable improvements in care.
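Uno's concordance index, unlike Harrell's, uses inverse-probability-of-censoring weights so that the estimate does not depend on the cohort's censoring distribution. As a hedged illustration of the metric only (this is not the study's code or data; the cohort below is synthetic), scikit-survival's `concordance_index_ipcw` computes it:

```python
# Minimal sketch, not the study's evaluation pipeline: estimate Uno's
# concordance index for a risk score against right-censored follow-up
# using scikit-survival. All data below are synthetic placeholders.
import numpy as np
from sksurv.util import Surv
from sksurv.metrics import concordance_index_ipcw

rng = np.random.default_rng(0)
n = 1000

# Synthetic follow-up: time to diagnosis or censoring (years) and an event
# flag (True = cancer diagnosed during follow-up, False = censored).
time = rng.exponential(scale=8.0, size=n).clip(0.1, 10.0)
event = rng.random(n) < 0.3

# A hypothetical risk score: higher scores should precede earlier events.
risk_score = -0.3 * time + rng.normal(scale=1.0, size=n)

# Structured array in the format scikit-survival expects; the first argument
# ("training" data) is used only to estimate the censoring distribution for
# the inverse-probability-of-censoring weights.
y = Surv.from_arrays(event=event, time=time)

# Truncate at tau=5 years to match a one-to-five-year risk horizon.
c_index, *_ = concordance_index_ipcw(y, y, risk_score, tau=5.0)
print(f"Uno's C (up to 5 years): {c_index:.3f}")
```

Read a value such as 0.75 as: for a randomly chosen comparable pair of patients, the model ranks the patient diagnosed earlier as higher risk about 75% of the time.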
Values are embedded throughout the machine learning for healthcare (MLHC) pipeline, from the design of models, to the execution and reporting of trials, to the regulatory approval process. Guidelines hold significant power in defining what is worthy of emphasis. While fairness is essential to the impact and consequences of MLHC tools, the concept is often conspicuously absent from, or ineffectually vague in, emerging guidelines. The field of MLHC has the opportunity at this juncture to make fairness integral to its identity. We call on the MLHC community to commit to the project of operationalising fairness and to emphasise fairness as a requirement in practice.
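One concrete way to begin operationalising fairness is to make group-conditional error rates a reported requirement rather than an afterthought. The sketch below is illustrative only and not from the cited piece; the groups, labels, and predictions are synthetic placeholders. It computes per-group true- and false-positive rates and their gaps, the quantities behind the equalized-odds criterion:

```python
# Illustrative sketch of one fairness check: per-group true- and
# false-positive rates and their largest gaps (equalized-odds differences).
import numpy as np

def group_rates(y_true, y_pred, groups):
    """Return {group: {"TPR": ..., "FPR": ...}} for binary labels/predictions."""
    rates = {}
    for g in np.unique(groups):
        m = groups == g
        tp = np.sum((y_pred == 1) & (y_true == 1) & m)
        fn = np.sum((y_pred == 0) & (y_true == 1) & m)
        fp = np.sum((y_pred == 1) & (y_true == 0) & m)
        tn = np.sum((y_pred == 0) & (y_true == 0) & m)
        rates[g] = {"TPR": tp / max(tp + fn, 1), "FPR": fp / max(fp + tn, 1)}
    return rates

rng = np.random.default_rng(1)
n = 2000
groups = rng.choice(["A", "B"], size=n)   # hypothetical demographic groups
y_true = rng.integers(0, 2, size=n)

# Simulate a model that errs more often on group B.
flip = rng.random(n) < np.where(groups == "B", 0.35, 0.15)
y_pred = np.where(flip, 1 - y_true, y_true)

rates = group_rates(y_true, y_pred, groups)
tpr_gap = max(r["TPR"] for r in rates.values()) - min(r["TPR"] for r in rates.values())
fpr_gap = max(r["FPR"] for r in rates.values()) - min(r["FPR"] for r in rates.values())
print(rates)
print(f"TPR gap={tpr_gap:.2f}  FPR gap={fpr_gap:.2f}")
```

A guideline could then require such gaps to be reported and bounded, turning fairness from an aspiration into a testable claim.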
Background Despite wide use of severity scoring systems for case-mix determination and benchmarking in the intensive care unit (ICU), the possibility of scoring bias across ethnicities has not been examined. Guidelines on the use of illness severity scores to inform triage decisions for the allocation of scarce resources, such as mechanical ventilation, during the current COVID-19 pandemic warrant examination for possible bias in these models. We investigated the performance of the severity scoring systems Acute Physiology and Chronic Health Evaluation IVa (APACHE IVa), Oxford Acute Severity of Illness Score (OASIS), and Sequential Organ Failure Assessment (SOFA) across four ethnicities in two large ICU databases to identify possible ethnicity-based bias. Methods Data from the electronic ICU Collaborative Research Database (eICU-CRD) and the Medical Information Mart for Intensive Care III (MIMIC-III) database, built from patient episodes in the USA from 2014–15 and 2001–12, respectively, were analysed for score performance in Asian, Black, Hispanic, and White people after appropriate exclusions. Hospital mortality was the outcome of interest. Discrimination and calibration were determined for all three scoring systems in all four groups, using the area under the receiver operating characteristic curve (AUROC) to assess discrimination and the standardised mortality ratio (SMR) or proxy measures to assess calibration. Findings We analysed 166 751 participants (122 919 from eICU-CRD and 43 832 from MIMIC-III). Although measurements of discrimination differed among the groups (AUROC ranging from 0·86 to 0·89 [p=0·016] with APACHE IVa and from 0·75 to 0·77 [p=0·85] with OASIS), they did not display any discernible systematic patterns of bias. However, measurements of calibration indicated persistent, and in some cases statistically significant, patterns of difference between Hispanic people (SMR 0·73 with APACHE IVa and 0·64 with OASIS) and Black people (0·67 and 0·68) versus Asian people (0·77 and 0·95) and White people (0·76 and 0·81). Although calibrations were imperfect for all groups, the scores consistently overpredicted mortality for Black people and Hispanic people. Similar results were seen with SOFA scores across the two databases. Interpretation The systematic differences in calibration across ethnicities suggest that illness severity scores reflect statistical bias in their predictions of mortality.
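To make the calibration measure concrete: the standardised mortality ratio is observed deaths divided by the deaths the score predicts, so an SMR below 1 means the score over-predicts mortality for that group. The sketch below is a synthetic illustration under assumed variable names, not eICU-CRD or MIMIC-III data or the study's code; it shows how per-group AUROC and SMR are computed and how they can diverge:

```python
# Minimal sketch (synthetic data): per-group discrimination (AUROC) and
# calibration (SMR = observed deaths / predicted deaths) for a severity
# score's mortality predictions. SMR < 1 means the score over-predicts
# mortality for that group, the pattern the study reports.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)
n = 50_000
group = rng.choice(["Asian", "Black", "Hispanic", "White"], size=n)

# Synthetic predicted mortality probabilities from a hypothetical score.
pred = rng.beta(2, 18, size=n)

# Simulate outcomes whose true risk is lower than predicted for two groups,
# i.e., the score over-predicts mortality there (a calibration bias).
shrink = np.where(np.isin(group, ["Black", "Hispanic"]), 0.7, 1.0)
died = rng.random(n) < pred * shrink

for g in np.unique(group):
    m = group == g
    auroc = roc_auc_score(died[m], pred[m])
    smr = died[m].sum() / pred[m].sum()
    print(f"{g:9s} AUROC={auroc:.3f} SMR={smr:.2f}")
```

Note how AUROC can remain broadly similar across groups even as the SMRs separate: discrimination and calibration are distinct properties, which is exactly the dissociation the study describes.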