In 1985, development of a computer system called "Deep Thought" began at Carnegie Mellon University with the lofty objective of building an autonomous system capable of outperforming the world's top chess grandmasters. Later renamed "Deep Blue," this chess-playing expert system defeated world champion Garry Kasparov in a six-game match in 1997. However, it was not until 2017 that a deep artificial neural network algorithm known as "AlphaZero" achieved superhuman performance in several challenging games, including chess, shogi, and Go (1). Such triumphs in computer-based technologies are common today, as artificial intelligence (AI) applications such as ChatGPT and DALL-E mimic human capabilities, even passing medical board examinations (2). The term AI describes the general ability of computers to emulate various characteristics of human intelligence, including pattern recognition, inference, and sequential decision-making, among others. Machine learning (ML) is a subset of AI that can learn the complex interactions or temporal relationships among multivariate risk factors without the need to hand-craft such features via expert knowledge (3). Retrospective studies have demonstrated that ML applications are particularly useful for their diagnostic and prognostic capabilities, leveraging the vast quantity of data available in the intensive care unit (ICU) (4, 5). Certain ML algorithms have approached human performance at narrow tasks such as predicting resuscitation strategies in sepsis (6), need for mechanical ventilation (7), mortality in critically ill patients (8), and ICU length of stay (9).
Sepsis is an attractive target for ML approaches because it is an inherently complex, common, costly, and deadly condition. Prediction of sepsis is the most commonly described ML application, although recent advances include approaches to optimize therapeutics and resuscitation strategies (6, 10).
Given the potential to improve patient-centered outcomes and the excitement about newer analytic approaches, it is no surprise that the number of ML algorithms aimed at improving sepsis care is increasing rapidly. However, errors in sepsis prediction are often highlighted in both anecdotal reports and health system-wide failures that can be traced to poor implementation approaches, rudimentary ML algorithms, application of algorithms outside their intended use, or lack of proper maintenance. Given these criticisms, what can be done at this point to demonstrate the value of these predictive models? We believe that a revised focus on data enrichment, proper implementation, and rigorous testing is required to bring the promise of AI to the ICU.