Abstract: Machine Learning for Health (ML4H) has demonstrated efficacy in computer imaging and other self-contained digital workflows, but has failed to substantially impact routine clinical care. This is no longer because of poor adoption of Electronic Health Record Systems (EHRS), but because ML4H needs an infrastructure for development, deployment, and evaluation within the healthcare institution. In this paper, we propose a design pattern called a Clinical Deployment Environment (CDE). We sketch the five pillars of …
“…There are also comments about the need for specialized professionals to participate in model construction and validation to promote better reliability (Wojtusiak, 2021; Risman, Trelles, & Denning, 2021; Harris et al., 2022; Rojas et al., 2022). Specialists can help in processing and making sense of the data, testing model performance, and defining evaluation methods, thus ensuring that the resulting models are accurate and reliable.…”
Section: Discussion (mentioning, confidence: 99%)
“…Another issue pointed out by some works is the need for good model interpretability (Rafiq, Modave, Guha, & Albert, 2020; Harris et al., 2022; Li et al., 2022; Duckworth et al., 2021). ML model interpretability and explainability can help ensure that ML-enabled applications provide coherent and reliable decisions.…”
Section: Discussion (mentioning, confidence: 99%)
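The interpretability concern in the excerpt above can be illustrated with model-agnostic permutation importance: shuffle one feature at a time and measure how much a trained model's accuracy drops. The sketch below is illustrative only; the synthetic "risk score" model and data are assumptions, not taken from any of the reviewed works.

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 3))
# Synthetic labels: only features 0 and 1 carry signal; feature 2 is noise.
y = (2.0 * X[:, 0] - 1.0 * X[:, 1] > 0).astype(int)

def model(X):
    # Stand-in for an already-trained classifier (hypothetical).
    return (2.0 * X[:, 0] - 1.0 * X[:, 1] > 0).astype(int)

def permutation_importance(model, X, y, n_repeats=5, seed=0):
    """Accuracy drop when each feature is shuffled, averaged over repeats."""
    rng = np.random.default_rng(seed)
    baseline = np.mean(model(X) == y)
    drops = []
    for j in range(X.shape[1]):
        scores = []
        for _ in range(n_repeats):
            Xp = X.copy()
            rng.shuffle(Xp[:, j])  # break the feature-target association
            scores.append(np.mean(model(Xp) == y))
        drops.append(baseline - np.mean(scores))
    return np.array(drops)

imp = permutation_importance(model, X, y)
```

A ranking like this gives clinicians a first-pass answer to "which inputs drive this prediction", though it does not replace domain-specific explanation methods.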
“…For example, models trained on data derived from a single health institution may not generalize well to multi-institutional scenarios. A variation on this problem is patient selection bias (regional, socioeconomic, and institutional) (Van Helvoort et al., 2020; Carolan et al., 2022; Lam et al., 2022; Birkenbihl et al., 2020; Kamran et al., 2022; Risman, Trelles, & Denning, 2021; Shickel et al., 2020; Bellocchio et al., 2021; Rafiq, Modave, Guha, & Albert, 2020; Harris et al., 2022; The RADAR-CNS Consortium et al., 2021; Li et al., 2022; Lin et al., 2022; Rojas et al., 2022; Yang, Zou, Liu, & Mulligan, 2021; Fries et al., 2019).…”
Section: Discussion (mentioning, confidence: 99%)
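A basic safeguard against the single-institution generalization problem described above is to report performance stratified by site rather than pooled. A minimal sketch on synthetic data (the site labels, degradation rate, and metric choice are all illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical predictions and labels tagged with the originating hospital.
sites = np.array(["A"] * 500 + ["B"] * 500)
labels = rng.integers(0, 2, 1000)
preds = labels.copy()
# Simulate degraded performance at site B (e.g., a different patient mix).
flip = (sites == "B") & (rng.random(1000) < 0.3)
preds[flip] = 1 - preds[flip]

def per_site_accuracy(sites, labels, preds):
    """Accuracy computed separately for each institution."""
    return {s: float(np.mean(preds[sites == s] == labels[sites == s]))
            for s in np.unique(sites)}

acc = per_site_accuracy(sites, labels, preds)
```

A pooled accuracy here would mask the gap; the per-site breakdown surfaces it, which is the point the cited works make about multi-institutional evaluation.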
“…Carolan et al. (2022) describe the need for better automation technologies to improve the efficiency of algorithms. There are also opportunities for expert management and monitoring (algorithmic stewardship), with projections of the near-future creation of MLOps departments for healthcare services and hospitals (Harris et al., 2022). Other possibilities include integrating equity into the ML lifecycle, removing biases, collecting feedback from experts and other stakeholders to bring human knowledge into the learning process (human-in-the-loop learning), and going beyond statistical metrics in evaluating model performance by using domain-oriented approaches to measure the usefulness and commercial value of these models (Rojas et al., 2022; Yang, Zou, Liu, & Mulligan, 2021).…”
Section: Discussion (mentioning, confidence: 99%)
“…Finally, there are opportunities for real-world applications supported by live data where teams can iteratively build and test at the bedside, continuous delivery (CD) MLOps platforms, design and oversight by people with AI security expertise, continuous assessment using randomization to avoid bias, and data flows built on the HL7 FHIR standard (Harris et al., 2022).…”
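The HL7 FHIR data flows mentioned above exchange clinical facts as typed JSON resources. A minimal sketch of building a FHIR R4 Observation in plain Python follows; the resource shape (resourceType, status, coded value, subject reference) comes from the FHIR specification, while the specific LOINC code, patient id, and timestamp are placeholder values for illustration.

```python
import json

# A minimal FHIR R4 Observation carrying a vital-sign value that a
# deployed model might consume or emit. Identifiers are placeholders.
observation = {
    "resourceType": "Observation",
    "status": "final",
    "code": {
        "coding": [{
            "system": "http://loinc.org",
            "code": "8867-4",          # LOINC code for heart rate
            "display": "Heart rate",
        }]
    },
    "subject": {"reference": "Patient/example-123"},
    "effectiveDateTime": "2024-01-01T00:00:00Z",
    "valueQuantity": {
        "value": 72,
        "unit": "beats/minute",
        "system": "http://unitsofmeasure.org",
        "code": "/min",
    },
}
payload = json.dumps(observation)
```

Standardizing on resources like this is what lets a model service plug into hospital data streams without per-institution custom parsers.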
Machine Learning (ML) models have been applied to solve problems in various fields, which necessarily involves proper evaluation of models to ensure performance. Once deployed, ML models are subject to performance issues, such as those related to changes in data (drift). This type of issue has prompted efforts in model analysis and maintenance, as well as in continual learning, which seeks the ability to learn continuously from a stream of data. It is therefore important to understand and develop methodologies for evaluating ML models so that their use in real-world environments is feasible. Among current areas of application for ML, one that stands out in particular is Machine Learning for Healthcare, especially in conjunction with software for medical decision support, which presents specific challenges for the evaluation and monitoring of models, particularly given that an incorrect prediction or classification can lead to life-threatening situations. This paper presents a systematic literature review that aims to identify state-of-the-art techniques for evaluating and maintaining ML models for healthcare in effective real-world use.
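The drift problem the abstract describes can be made concrete with a distribution-comparison check between training-time data and live data. Below is a minimal sketch using the Population Stability Index (PSI), a common drift statistic; the data, bin count, and the conventional ~0.2 alert threshold are illustrative assumptions, not taken from the reviewed works.

```python
import numpy as np

def psi(expected, actual, bins=10):
    """Population Stability Index between a reference and a live sample.

    Values near 0 mean the distributions match; values above ~0.2 are
    commonly read as significant drift.
    """
    edges = np.histogram_bin_edges(expected, bins=bins)
    e_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    a_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    # Floor empty bins at a small epsilon to avoid log(0).
    e_pct = np.clip(e_pct, 1e-6, None)
    a_pct = np.clip(a_pct, 1e-6, None)
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))

rng = np.random.default_rng(0)
reference = rng.normal(0.0, 1.0, 5000)  # training-time feature distribution
shifted = rng.normal(0.5, 1.0, 5000)    # live data with a mean shift

psi_same = psi(reference, reference[:2500])  # small: same distribution
psi_drift = psi(reference, shifted)          # large: drifted distribution
```

A monitoring job could compute this per feature on each batch of live data and page the team when the index crosses the chosen threshold, which is one concrete form of the model-maintenance work the review surveys.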