Advances in natural language processing (NLP) and unsupervised learning have recently enabled the long-expected synergy between true "big data" and analytics based on machine learning (ML). The ability to reliably generate structured data from unstructured electronic health records (EHRs) such as free-text reports, document scans, or unlabelled medical imaging has, on the one hand, allowed the development of algorithms trained on previously unseen amounts of data, spanning entire hospital or even national populations rather than only databases compiled by human experts. On the other hand, the capacity to generate structured reports directly from unstructured raw EHRs has proven valuable for hospital analytics, epidemiological studies, and systematic reviews.

In their narrative review, Schwartz et al.1 report on the utilization of EHRs in spine surgery through ML techniques. The authors are to be commended for their detailed description of the data types commonly found in EHRs, the learning concepts used to generate structured data (such as NLP and machine vision), the applications of ML for prognosis and prediction, and finally the challenges inherent to using unstructured data from EHRs in medical practice and research. As the authors show, there is no question that ML is already starting to affect many aspects of surgical practice. In particular, the advent of open-source algorithms provided by today's tech giants has largely democratized the development of ML models, and as the authors summarize, this has led to an explosion of publications reporting such algorithms. Still, it is important to preserve methodological quality in papers utilizing ML techniques, a standard that is often not met.
For example, the authors touch on the issue of ensuring generalizability through robust training structures (i.e., some form of resampling) and external validation before models are rolled out into clinical practice.

We especially value that the authors discuss the problem of uninterpretable "black box" models.2 Currently, many groups are applying complex ML algorithms to relatively small patient samples and to relatively simple tasks. While this might yield slight gains in model performance, these complex models (such as deep neural networks for nonimaging applications) are often typical "black box" models with a total loss of the ability to explain which factors lead the algorithm to make a certain decision. Explainability is, unfortunately, often traded for a small and likely irrelevant increase in model accuracy. Especially in

Neurospine 2019;16(4):654-656.