Assessment of a deep-learning system for fracture detection in musculoskeletal radiographs

Jones, Rebecca M.; Sharma, Anuj Kumar; Hotchkiss, Robert; Sperling, John W.; Hamburger, Jackson; Ledig, Christian; O’Toole, Robert V.; Gardner, Michael J.; Venkatesh, Srivas; Roberts, Matthew M.; Sauvestre, Romain; Shatkhin, Max; Gupta, Anant; Chopra, Sumit; Kumaravel, Manickam; Daluiski, Aaron; Plogger, Will; Nascone, Jason W.; Potter, Hollis G.; Lindsey, Robert

doi:10.1038/s41746-020-00352-w

Cited by 80 publications

(39 citation statements)

References 25 publications

“…including the upper and lower extremities and spine (13), but the clinicians who read the radiographs with AI and without AI assistance were emergency medicine physicians and physician assistants only, with senior orthopedic surgeons providing the ground truth; no radiologist was involved in the radiographic interpretation. Another recent study analyzed fractures in 16 anatomic locations; however, readers of the radiographs were radiologists and orthopedic surgeons only (14).…”

Section: Ground Truth Definition (mentioning)

confidence: 99%

Improving Radiographic Fracture Recognition Performance and Efficiency Using Artificial Intelligence

Guermazi, Tannoury, Kompel et al. 2022

Radiology

101

Fracture detection using radiography is one of the most common tasks in patients with high- or low-energy trauma in various clinical settings, including the emergency department, urgent care, and outpatient clinics such as orthopedics, rheumatology, and family medicine. Missed fractures on radiographs are one of the most common causes of diagnostic discrepancies between initial interpretations by nonradiologists or radiology residents and the final read by board-certified radiologists, leading to preventable harm or delay in care to the patient (1-3). Fracture interpretation errors can represent up to 24% of harmful diagnostic errors seen in the emergency department (2). Furthermore, inconsistencies in radiographic diagnosis of fractures are more common during the evening and overnight hours (5 pm to 3 am), likely related to nonexpert reading and fatigue (3). In patients with multiple traumas, the proportion of missed injuries, including fractures, can be high on the forearm and hands (6.6%) and feet (6.5%) (4,5).

To date, several studies about artificial intelligence (AI) aid to fracture detection have been performed focusing only on certain body parts, such as hand, wrist, and forearm (6-9); hip and pelvis (10,11); knees (9); and spine (12). One study evaluated fractures in 11 body locations, …

Background: Missed fractures are a common cause of diagnostic discrepancy between initial radiographic interpretation and the final read by board-certified radiologists.

Purpose: To assess the effect of assistance by artificial intelligence (AI) on diagnostic performances of physicians for fractures on radiographs.

Materials and Methods: This retrospective diagnostic study used the multi-reader, multi-case methodology based on an external multicenter data set of 480 examinations with at least 60 examinations per body region (foot and ankle, knee and leg, hip and pelvis, hand and wrist, elbow and arm, shoulder and clavicle, rib cage, and thoracolumbar spine) between July 2020 and January 2021. Fracture prevalence was set at 50%. The ground truth was determined by two musculoskeletal radiologists, with discrepancies solved by a third. Twenty-four readers (radiologists, orthopedists, emergency physicians, physician assistants, rheumatologists, family physicians) were presented the whole validation data set (n = 480), with and without AI assistance, with a 1-month minimum washout period. The primary analysis had to demonstrate superiority of sensitivity per patient and the noninferiority of specificity per patient at −3% margin with AI aid. Stand-alone AI performance was also assessed using receiver operating characteristic curves.

Results: A total of 480 patients were included (mean age, 59 years ± 16 [standard deviation]; 327 women). The sensitivity per patient was 10.4% higher (95% CI: 6.9, 13.9; P < .001 for superiority) with AI aid (4331 of 5760 readings, 75.2%) than without AI (3732 of 5760 readings, 64.8%). The specificity per patient with AI aid (5504 of 5760 readings, 95.6%) was noninferior to that ...
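The percentages quoted in the Results can be checked directly from the reading counts. The following is a minimal sketch in Python, not the study's analysis: the helper name `percent` and the variable names are illustrative assumptions, and the confidence interval reported in the abstract comes from the multi-reader, multi-case analysis, which this simple arithmetic does not reproduce.

```python
def percent(hits: int, total: int) -> float:
    """Return hits / total as a percentage (hypothetical helper)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * hits / total


# 24 readers x 240 examinations (50% of 480) = 5760 readings per class.
READINGS_PER_CLASS = 24 * 240

# Counts as quoted in the abstract.
sens_with_ai = percent(4331, READINGS_PER_CLASS)     # sensitivity with AI aid
sens_without_ai = percent(3732, READINGS_PER_CLASS)  # sensitivity without AI
spec_with_ai = percent(5504, READINGS_PER_CLASS)     # specificity with AI aid

# Absolute difference in sensitivity, in percentage points.
sens_gain = sens_with_ai - sens_without_ai

print(f"Sensitivity with AI:    {sens_with_ai:.1f}%")
print(f"Sensitivity without AI: {sens_without_ai:.1f}%")
print(f"Sensitivity gain:       {sens_gain:.1f} percentage points")
print(f"Specificity with AI:    {spec_with_ai:.1f}%")
```

Rounded to one decimal place, these values agree with the figures given in the abstract (75.2%, 64.8%, a 10.4-point gain, and 95.6%).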

“…In particular, virtual fracture clinic review [25] and out-of-hours teleradiology services [26] have been widely adopted across the UK and Europe. Alongside these existing methods, the development of novel technologies (such as artificial intelligence algorithms [27]) to supplement interpretation is evidence of a broadly accepted clinical need to improve this reporting.…”

Section: Key Findings (mentioning)

confidence: 99%

Reporting errors in plain radiographs for lower limb trauma—a systematic review and meta-analysis

York, Franklin, Reynolds et al. 2021

Skeletal Radiol

Introduction: Plain radiographs are a globally ubiquitous means of investigation for injuries to the musculoskeletal system. Despite this, initial interpretation remains a challenge and inaccuracies give rise to adverse sequelae for patients and healthcare providers alike. This study sought to address the limited, existing meta-analytic research on the initial reporting of radiographs for skeletal trauma, with specific regard to diagnostic accuracy of the most commonly injured region of the appendicular skeleton, the lower limb.

Method: A prospectively registered, systematic review and meta-analysis was performed using published research from the major clinical-science databases. Studies identified as appropriate for inclusion underwent methodological quality and risk of bias analysis. Meta-analysis was then performed to establish summary rates for specificity and sensitivity of diagnostic accuracy, including covariates by anatomical site, using HSROC and bivariate models.

Results: A total of 3887 articles were screened, with 10 identified as suitable for analysis based on the eligibility criteria. Sensitivity and specificity across the studies were 93.5% and 89.7% respectively. Compared with other anatomical subdivisions, interpretation of ankle radiographs yielded the highest sensitivity and specificity, with values of 98.1% and 94.6% respectively, and a diagnostic odds ratio of 929.97.

Conclusion: Interpretation of lower limb skeletal radiographs operates at a reasonably high degree of sensitivity and specificity. However, one in twenty true positives is missed on initial radiographic interpretation and safety netting systems need to be established to address this. Virtual fracture clinic reviews and teleradiology services in conjunction with novel technology will likely be crucial in these circumstances.

“…Recently, deep learning algorithms, especially convolutional neural network (CNN) architectures, have been widely recognized as an outperforming and reliable approach to identify clinically useful features directly from the medical images. (18) Previous studies that used CNNs showed promising results in radiology, (19) pathology, (20) ophthalmology, (21) surgery, (22) and laboratory medicine. (23) With the continuous improvement of the CNN architecture and the rapid increase in hardware computing power, CNNs have achieved human‐level recognition performance.…”

Section: Introduction (mentioning)

confidence: 99%

Opportunistic Osteoporosis Screening Using Chest Radiographs With Deep Learning: Development and External Validation With a Cohort Dataset

Jang, Kim, Bae et al. 2020

Journal of Bone and Mineral Research

Osteoporosis is a common, but silent disease until it is complicated by fractures that are associated with morbidity and mortality. Over the past few years, although deep learning‐based disease diagnosis on chest radiographs has yielded promising results, osteoporosis screening remains unexplored. Paired data with 13,026 chest radiographs and dual‐energy X‐ray absorptiometry (DXA) results from the Health Screening and Promotion Center of Asan Medical Center, between 2012 and 2019, were used as the primary dataset in this study. For the external test, we additionally used the Asan osteoporosis cohort dataset (1089 chest radiographs, 2010 and 2017). Using a well‐performed deep learning model, we trained the OsPor‐screen model with labels defined by DXA based diagnosis of osteoporosis (lumbar spine, femoral neck, or total hip T‐score ≤ −2.5) in a supervised learning manner. The OsPor‐screen model was assessed in the internal and external test sets. We performed substudies for evaluating the effect of various anatomical subregions and image sizes of input images. OsPor‐screen model performances including sensitivity, specificity, and area under the curve (AUC) were measured in the internal and external test sets. In addition, visual explanations of the model to predict each class were expressed in gradient‐weighted class activation maps (Grad‐CAMs). The OsPor‐screen model showed promising performances. Osteoporosis screening with the OsPor‐screen model achieved an AUC of 0.91 (95% confidence interval [CI], 0.90–0.92) and an AUC of 0.88 (95% CI, 0.85–0.90) in the internal and external test set, respectively. Even though the medical relevance of these average Grad‐CAMs is unclear, these results suggest that a deep learning‐based model using chest radiographs could have the potential to be used for opportunistic automated screening of patients with osteoporosis in clinical settings. © 2021 American Society for Bone and Mineral Research (ASBMR).
