MUSCULOSKELETAL IMAGING

Fracture detection on radiographs is one of the most common tasks in the care of patients with high- or low-energy trauma in various clinical settings, including the emergency department, urgent care, and outpatient clinics such as orthopedics, rheumatology, and family medicine. Missed fractures on radiographs are one of the most common causes of diagnostic discrepancies between initial interpretations by nonradiologists or radiology residents and the final read by board-certified radiologists, leading to preventable harm or delay in care to the patient (1-3). Fracture interpretation errors can represent up to 24% of harmful diagnostic errors seen in the emergency department (2). Furthermore, inconsistencies in radiographic diagnosis of fractures are more common during the evening and overnight hours (5 pm to 3 am), likely related to nonexpert reading and fatigue (3). In patients with multiple traumas, the proportion of missed injuries, including fractures, can be high in the forearm and hands (6.6%) and feet (6.5%) (4,5).

To date, several studies of artificial intelligence (AI) assistance for fracture detection have focused on specific body parts only, such as the hand, wrist, and forearm (6-9); hip and pelvis (10,11); knees (9); and spine (12). One study evaluated fractures in 11 body locations,

Background: Missed fractures are a common cause of diagnostic discrepancy between initial radiographic interpretation and the final read by board-certified radiologists.

Purpose: To assess the effect of assistance by artificial intelligence (AI) on the diagnostic performance of physicians for fractures on radiographs.
Materials and Methods: This retrospective diagnostic study used the multi-reader, multi-case methodology based on an external multicenter data set of 480 examinations with at least 60 examinations per body region (foot and ankle, knee and leg, hip and pelvis, hand and wrist, elbow and arm, shoulder and clavicle, rib cage, and thoracolumbar spine) between July 2020 and January 2021. Fracture prevalence was set at 50%. The ground truth was determined by two musculoskeletal radiologists, with discrepancies resolved by a third. Twenty-four readers (radiologists, orthopedists, emergency physicians, physician assistants, rheumatologists, family physicians) interpreted the whole validation data set (n = 480), with and without AI assistance, with a 1-month minimum washout period. The primary analysis had to demonstrate superiority of sensitivity per patient and noninferiority of specificity per patient at a −3% margin with AI aid. Stand-alone AI performance was also assessed using receiver operating characteristic curves.

Results: A total of 480 patients were included (mean age, 59 years ± 16 [standard deviation]; 327 women). The sensitivity per patient was 10.4% higher (95% CI: 6.9, 13.9; P < .001 for superiority) with AI aid (4331 of 5760 readings, 75.2%) than without AI (3732 of 5760 readings, 64.8%). The specificity per patient with AI aid (5504 of 5760 readings, 95.6%) was noninferior to that ...
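
The pooled percentages reported above can be checked with simple arithmetic. The following is a minimal sketch, not the study's analysis code: the function name `pooled_rate` and the variable names are hypothetical, and the count of 240 examinations per class is an assumption derived from the stated 50% fracture prevalence among 480 examinations. It reproduces point estimates only; the reported confidence interval and P value come from the multi-reader, multi-case analysis, which is not reproduced here.

```python
def pooled_rate(correct_readings: int, total_readings: int) -> float:
    """Return the pooled proportion of correct readings across all readers."""
    if total_readings <= 0:
        raise ValueError("total_readings must be positive")
    return correct_readings / total_readings


# Assumption: 50% prevalence in 480 examinations gives 240 examinations
# with fracture and 240 without; each is read once by all 24 readers.
n_readers = 24
n_exams_per_class = 480 // 2
readings_per_class = n_readers * n_exams_per_class  # 5760, as reported

# Counts of correct readings taken from the Results text.
sens_with_ai = pooled_rate(4331, readings_per_class)
sens_without_ai = pooled_rate(3732, readings_per_class)
spec_with_ai = pooled_rate(5504, readings_per_class)

# Absolute gain in sensitivity, in percentage points once formatted.
sens_gain = sens_with_ai - sens_without_ai

print(f"Sensitivity with AI:    {sens_with_ai:.1%}")
print(f"Sensitivity without AI: {sens_without_ai:.1%}")
print(f"Absolute gain:          {sens_gain:.1%}")
print(f"Specificity with AI:    {spec_with_ai:.1%}")
```

Pooling all readings treats every reader-examination pair equally, which matches the "of 5760 readings" denominators in the text but ignores the correlation between readings of the same examination.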