Background Deep learning offers considerable promise for medical diagnostics. We aimed to evaluate the diagnostic accuracy of deep learning algorithms versus health-care professionals in classifying diseases using medical imaging.

Methods In this systematic review and meta-analysis, we searched Ovid-MEDLINE, Embase, Science Citation Index, and Conference Proceedings Citation Index for studies published from Jan 1, 2012, to June 6, 2019. Studies comparing the diagnostic performance of deep learning models and health-care professionals based on medical imaging, for any disease, were included. We excluded studies that used medical waveform data graphics material or that investigated the accuracy of image segmentation rather than disease classification. We extracted binary diagnostic accuracy data and constructed contingency tables to derive the outcomes of interest: sensitivity and specificity. Studies undertaking an out-of-sample external validation were included in a meta-analysis, using a unified hierarchical model. This study is registered with PROSPERO, CRD42018091176.

Findings Our search identified 31 587 studies, of which 82 (describing 147 patient cohorts) were included. 69 studies provided enough data to construct contingency tables, enabling calculation of test accuracy, with sensitivity ranging from 9·7% to 100·0% (mean 79·1%, SD 0·2) and specificity ranging from 38·9% to 100·0% (mean 88·3%, SD 0·1). An out-of-sample external validation was done in 25 studies, of which 14 made the comparison between deep learning models and health-care professionals in the same sample.
Comparison of the performance between deep learning models and health-care professionals in these 14 studies, when restricting the analysis to the contingency table for each study reporting the highest accuracy, found a pooled sensitivity of 87·0% (95% CI 83·0-90·2) for deep learning models and 86·4% (79·9-91·0) for health-care professionals, and a pooled specificity of 92·5% (95% CI 85·1-96·4) for deep learning models and 90·5% (80·6-95·7) for health-care professionals.

Interpretation Our review found the diagnostic performance of deep learning models to be equivalent to that of health-care professionals. However, a major finding of the review is that few studies presented externally validated results or compared the performance of deep learning models and health-care professionals using the same sample. Additionally, poor reporting is prevalent in deep learning studies, which limits reliable interpretation of the reported diagnostic accuracy. New reporting standards that address the specific challenges of deep learning could improve future studies, enabling greater confidence in the results of future evaluations of this promising technology.
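The sensitivities and specificities pooled above are each derived from a study's 2×2 contingency table. As a minimal illustrative sketch (the cell counts below are invented, not taken from any included study), both metrics follow directly from the four counts:

```python
# Illustrative sketch: sensitivity and specificity from a 2x2 contingency
# table, as extracted for each study in the meta-analysis. Counts are
# hypothetical, not from any study in the review.

def sensitivity(tp: int, fn: int) -> float:
    """True-positive rate: proportion of diseased cases correctly flagged."""
    return tp / (tp + fn)

def specificity(tn: int, fp: int) -> float:
    """True-negative rate: proportion of healthy cases correctly cleared."""
    return tn / (tn + fp)

# Example contingency table (illustrative counts):
#                 disease present   disease absent
# test positive        tp = 87          fp = 8
# test negative        fn = 13          tn = 92
print(f"sensitivity = {sensitivity(87, 13):.1%}")  # 87.0%
print(f"specificity = {specificity(92, 8):.1%}")   # 92.0%
```

Note that the pooled estimates in the review come from a hierarchical model fitted across studies, not from averaging such per-study values directly.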
Background Deep learning has the potential to transform health care; however, substantial expertise is required to train such models. We sought to evaluate the utility of automated deep learning software for the development of medical image diagnostic classifiers by health-care professionals with no coding and no deep learning expertise.

Methods We used five publicly available open-source datasets: retinal fundus images (MESSIDOR); optical coherence tomography (OCT) images (Guangzhou Medical University and Shiley Eye Institute, version 3); images of skin lesions (Human Against Machine [HAM] 10000); and both paediatric and adult chest x-ray (CXR) images (Guangzhou Medical University and Shiley Eye Institute, version 3, and the National Institutes of Health [NIH] dataset, respectively). These were separately fed into a neural architecture search framework, hosted through Google Cloud AutoML, that automatically developed a deep learning architecture to classify common diseases. Sensitivity (recall), specificity, and positive predictive value (precision) were used to evaluate the diagnostic properties of the models. Discriminative performance was assessed using the area under the precision-recall curve (AUPRC). For the deep learning model developed on a subset of the HAM10000 dataset, we did an external validation using the Edinburgh Dermofit Library dataset.

Findings Diagnostic properties and discriminative performance from internal validations were high in the binary classification tasks (sensitivity 73·3-97·0%; specificity 67-100%; AUPRC 0·87-1·00). In the multiple classification tasks, the diagnostic properties ranged from 38% to 100% for sensitivity and from 67% to 100% for specificity. The discriminative performance in terms of AUPRC ranged from 0·57 to 1·00 in the five automated deep learning models.
In an external validation using the Edinburgh Dermofit Library dataset, the automated deep learning model showed an AUPRC of 0·47, with a sensitivity of 49% and a positive predictive value of 52%.

Interpretation All models, except the automated deep learning model trained on the multilabel classification task of the NIH CXR14 dataset, showed discriminative performance and diagnostic properties comparable to state-of-the-art deep learning algorithms. Performance in the external validation study was low. The quality of the open-access datasets (including insufficient information about patient flow and demographics) and the absence of measures of precision, such as confidence intervals, constituted the major limitations of this study. The availability of automated deep learning platforms provides an opportunity for the medical community to enhance its understanding of model development and evaluation. Although the derivation of classification models without requiring a deep understanding of the mathematical, statistical, and programming principles is attractive, comparable performance to expertly designed models is limited to more elementary classification tasks. Furthermore, care should be placed in adhering t...
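The AUPRC values reported above are commonly estimated by average precision: the mean of precision evaluated at the rank of each true positive once predictions are sorted by score. A brief pure-Python sketch (labels and scores below are illustrative, not drawn from any of the models evaluated):

```python
# Illustrative sketch: average precision, a standard estimator of the area
# under the precision-recall curve (AUPRC). Data are hypothetical.

def average_precision(labels, scores):
    """Mean of precision at the rank of each positive, with predictions
    sorted by descending score."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp, ap = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            tp += 1
            ap += tp / rank  # precision at this rank
    return ap / sum(labels)

# Two positives; one ranked 1st, one ranked 3rd -> AP = (1/1 + 2/3) / 2
print(average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))
```

A perfect ranking (all positives scored above all negatives) yields an AP of 1.0, matching the top of the 0·57-1·00 AUPRC range reported for the internal validations.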
In recent years, there has been considerable interest in the prospect of machine learning models demonstrating expert-level diagnosis in multiple disease contexts. However, there is concern that the excitement around this field may be associated with inadequate scrutiny of methodology and insufficient adoption of scientific good practice in the studies involving artificial intelligence in health care. This article aims to empower clinicians and researchers to critically appraise studies of clinical applications of machine learning, through: (1) introducing basic machine learning concepts and nomenclature; (2) outlining key applicable principles of evidence-based medicine; and (3) highlighting some of the potential pitfalls in the design and reporting of these studies.
Insights into systemic disease through retinal imaging-based oculomics. Trans Vis Sci Tech. 2020;9(2):6, https://doi.org/10.1167/tvst.9.2.6

Among the most noteworthy developments in ophthalmology over the last decade has been the emergence of quantifiable high-resolution imaging modalities, which are typically non-invasive, rapid, and widely available. Such imaging is of unquestionable utility in the assessment of ocular disease; however, evidence is also mounting for its role in identifying ocular biomarkers of systemic disease, which we term oculomics. In this review, we highlight our current understanding of how retinal morphology evolves in two leading causes of global morbidity and mortality: cardiovascular disease and dementia. Population-based analyses have demonstrated the predictive value of retinal microvascular indices, as measured through fundus photography, in screening for heart attack and stroke. Similarly, the association between the structure of the neurosensory retina and prevalent neurodegenerative disease, in particular Alzheimer's disease, is now well established. Given the growing size and complexity of emerging multimodal datasets, modern artificial intelligence techniques, such as deep learning, may provide the optimal opportunity to further characterize these associations, enhance our understanding of eye-body relationships, and secure novel scalable approaches to the risk stratification of chronic complex disorders of ageing.
Keratin 9 (K9) is a type I intermediate filament protein whose expression is confined to the suprabasal layers of the palmoplantar epidermis. Although mutations in the K9 gene are known to cause epidermolytic palmoplantar keratoderma, a rare dominant-negative skin disorder, its functional significance is poorly understood. To gain insight into the physical requirement and importance of K9, we generated K9-deficient (Krt9−/−) mice. Here, we report that adult Krt9−/− mice develop calluses marked by hyperpigmentation that are exclusively localized to the stress-bearing footpads. Histological, immunohistochemical, and immunoblot analyses of these regions revealed hyperproliferation, impaired terminal differentiation, and abnormal expression of keratins K5, K14, and K2. Furthermore, the absence of K9 induces the stress-activated keratins K6 and K16. Importantly, mice heterozygous for the K9-null allele (Krt9+/−) show neither an overt nor a histological phenotype, demonstrating that one Krt9 allele is sufficient for the development of normal palmoplantar epidermis. Together, our data demonstrate that complete ablation of K9 is not tolerable in vivo and that K9 is required for terminal differentiation and for maintaining the mechanical integrity of palmoplantar epidermis.
Background Hospital Eye Services (HES) in the UK face an increasing number of optometric referrals driven by progress in retinal imaging. The National Health Service (NHS) published a 10-year strategy (NHS Long-Term Plan) to transform services to meet this challenge. In this study, we implemented a cloud-based referral platform to improve communication between optometrists and ophthalmologists.

Methods Retrospective cohort study conducted at Moorfields Eye Hospital, Croydon (NHS Foundation Trust, London, UK). Patients classified into the HES referral pathway by contributing optometrists were included in this study. The main outcome measure was the reduction of unnecessary referrals.

Results After reviewing the patients' data in a web-based interface, 54 (52%) of 103 attending patients initially classified into the referral pathway did not need a specialist referral. Fourteen (14%) patients needing urgent treatment were identified. Usability was measured as the duration of data input and review, which averaged 9.2 min (median: 5.4; IQR: 3.4–8.7) for optometrists and 3.0 min (median: 3.0; IQR: 1.7–3.9) for ophthalmologists. A variety of diagnoses was covered by this tool, with dry age-related macular degeneration (n=34) being the most common.

Conclusion After implementation, more than half of the HES referrals were avoided. This platform offers a digital-first solution that enables rapid-access eye care for patients in community optometry, facilitates communication between healthcare providers, and may serve as a foundation for the implementation of artificial intelligence.
Purpose: To apply a deep learning algorithm for automated, objective, and comprehensive quantification of OCT scans to a large real-world dataset of eyes with neovascular age-related macular degeneration (AMD) and to make the raw segmentation output data openly available for further research.

Design: Retrospective analysis of OCT images from the Moorfields Eye Hospital AMD Database.

Participants: A total of 2473 first-treated eyes and 493 second-treated eyes that commenced therapy for neovascular AMD between June 2012 and June 2017.

Methods: A deep learning algorithm was used to segment all baseline OCT scans. Volumes were calculated for segmented features such as neurosensory retina (NSR), drusen, intraretinal fluid (IRF), subretinal fluid (SRF), subretinal hyperreflective material (SHRM), retinal pigment epithelium (RPE), hyperreflective foci (HRF), fibrovascular pigment epithelium detachment (fvPED), and serous PED (sPED). Analyses included comparisons between first- and second-treated eyes by visual acuity (VA) and race/ethnicity, and correlations between volumes.

Main Outcome Measures: Volumes of segmented features (mm³) and central subfield thickness (CST) (mm).

Results: In first-treated eyes, the majority had both IRF and SRF (54.7%). First-treated eyes had greater volumes for all segmented tissues, with the exception of drusen, which was greater in second-treated eyes. In first-treated eyes, older age was associated with lower volumes of RPE, SRF, NSR, and sPED; in second-treated eyes, older age was associated with lower volumes of NSR, RPE, sPED, fvPED, and SRF. Eyes from Black individuals had higher SRF, RPE, and serous PED volumes compared with other ethnic groups. Greater volumes of the majority of features were associated with worse VA.

Conclusions: We report the results of large-scale automated quantification of a novel range of baseline features in neovascular AMD.
Major differences between first- and second-treated eyes, with increasing age, and between ethnicities are highlighted. In the coming years, enhanced, automated OCT segmentation may assist personalization of real-world care and the detection of novel structure–function correlations. These data will be made publicly available for replication and future investigation by the AMD research community.
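The per-feature volumes above come from per-voxel segmentation output. A hedged sketch of the underlying arithmetic, assuming a binary voxel mask and known voxel spacing (the function name, mask, and spacing below are illustrative assumptions, not details from the paper): a feature's volume in mm³ is its voxel count multiplied by the volume of a single voxel.

```python
# Illustrative sketch (assumed names and spacing): converting a binary
# per-voxel segmentation mask for one feature (e.g. SRF) into a volume
# in mm^3, as automated OCT quantification pipelines typically do.

def feature_volume_mm3(mask, voxel_dims_mm):
    """Volume of one segmented feature: voxel count x single-voxel volume.

    mask: nested lists of 0/1 voxel labels (slices x rows x columns).
    voxel_dims_mm: (dx, dy, dz) spacing of one voxel, in mm.
    """
    dx, dy, dz = voxel_dims_mm
    n_voxels = sum(v for slab in mask for row in slab for v in row)
    return n_voxels * dx * dy * dz

# Tiny 2x2x2 mask with 4 labelled voxels at 0.1 mm isotropic spacing:
mask = [[[1, 0], [1, 1]], [[0, 0], [1, 0]]]
print(feature_volume_mm3(mask, (0.1, 0.1, 0.1)))  # ~0.004 mm^3
```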
IMPORTANCE Telemedicine is accelerating the remote detection and monitoring of medical conditions, such as vision-threatening diseases. Meaningful deployment of smartphone apps for home vision monitoring should consider the barriers to patient uptake and engagement and address issues around digital exclusion in vulnerable patient populations.

OBJECTIVE To quantify the associations of patient characteristics and clinical measures with vision monitoring app uptake and engagement.

DESIGN, SETTING, AND PARTICIPANTS In this cohort and survey study, consecutive adult patients attending Moorfields Eye Hospital receiving intravitreal injections for retinal disease between May 2020 and February 2021 were included.

EXPOSURES Patients were offered the Home Vision Monitor (HVM) smartphone app to self-test their vision. A patient survey was conducted to capture their experience. App data, demographic characteristics, survey results, and clinical data from the electronic health record were analyzed via regression and machine learning.

MAIN OUTCOMES AND MEASURES Associations of patient uptake, compliance, and use rate, measured in odds ratios (ORs).

RESULTS Of 417 included patients, 236 (56.6%) were female, and the mean (SD) age was 72.8 (12.8) years. A total of 258 patients (61.9%) were active users. Uptake was negatively associated with age (OR, 0.98; 95% CI, 0.97-0.998; P = .02) and positively associated with both visual acuity in the better-seeing eye (OR, 1.02; 95% CI, 1.00-1.03; P = .01) and baseline number of intravitreal injections (OR, 1.01; 95% CI, 1.00-1.02; P = .02). Of 258 active patients, 166 (64.3%) fulfilled the definition of compliance. Compliance was associated with a diagnosis of neovascular age-related macular degeneration (OR, 1.94; 95% CI, 1.07-3.53; P = .002), White British ethnicity (OR, 1.69; 95% CI, 0.96-3.01; P = .02), and visual acuity in the better-seeing eye at baseline (OR, 1.02; 95% CI, 1.01-1.04; P = .04).
Use rate was higher with increasing levels of comfort with use of modern technologies (β = 0.031; 95% CI, 0.007-0.055; P = .02). A total of 119 patients (98.4%) found the app either easy or very easy to use, while 96 (82.1%) experienced increased reassurance from using the app.

CONCLUSIONS AND RELEVANCE This evaluation of home vision monitoring for patients with common vision-threatening disease within a clinical practice setting revealed demographic, clinical, and patient-related factors associated with patient uptake and engagement. These insights inform targeted interventions to address risks of digital exclusion with smartphone-based medical devices.
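The odds ratios reported in this study come from regression models. As a brief illustrative sketch (the coefficient below is invented, chosen only to echo the magnitude of the reported age effect), an OR is the exponential of a logistic-regression coefficient:

```python
# Hedged sketch: odds ratio from a logistic-regression coefficient.
# The coefficient value is illustrative, not taken from the study.
import math

def odds_ratio(beta: float) -> float:
    """OR = exp(beta): multiplicative change in the odds of the outcome
    per one-unit increase in the predictor."""
    return math.exp(beta)

# e.g. a coefficient of -0.02 per year of age gives OR ~ 0.98, i.e. each
# additional year of age lowers the odds of uptake by about 2%.
print(round(odds_ratio(-0.02), 2))  # 0.98
```

An OR below 1 (as for age here) means the odds fall as the predictor rises; an OR above 1 (as for visual acuity and injection count) means they rise.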