Objective To examine the accuracy of artificial intelligence (AI) for the detection of breast cancer in mammography screening practice.

Design Systematic review of test accuracy studies.

Data sources Medline, Embase, Web of Science, and Cochrane Database of Systematic Reviews from 1 January 2010 to 17 May 2021.

Eligibility criteria Studies reporting test accuracy of AI algorithms, alone or in combination with radiologists, to detect cancer in women’s digital mammograms in screening practice, or in test sets. The reference standard was biopsy with histology or follow-up (for screen negative women). Outcomes included test accuracy and cancer type detected.

Study selection and synthesis Two reviewers independently assessed articles for inclusion and assessed the methodological quality of included studies using the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool. A single reviewer extracted data, which were checked by a second reviewer. Narrative data synthesis was performed.

Results Twelve studies totalling 131 822 screened women were included. No prospective studies measuring test accuracy of AI in screening practice were found, and the included studies were of poor methodological quality. Three retrospective studies compared AI systems with the clinical decisions of the original radiologist, including 79 910 women, of whom 1878 had screen detected cancer or interval cancer within 12 months of screening. Thirty four (94%) of 36 AI systems evaluated in these studies were less accurate than a single radiologist, and all were less accurate than the consensus of two or more radiologists. Five smaller studies (1086 women, 520 cancers) at high risk of bias and low generalisability to the clinical context reported that all five evaluated AI systems (used either standalone, to replace the radiologist, or as a reader aid) were more accurate than a single radiologist reading a test set in the laboratory. In three studies, AI used for triage screened out 53%, 45%, and 50% of women at low risk but also 10%, 4%, and 0% of cancers detected by radiologists.

Conclusions Current evidence on AI does not yet allow judgement of its accuracy in breast cancer screening programmes, and it is unclear where on the clinical pathway AI might be of most benefit. AI systems are not sufficiently specific to replace radiologist double reading in screening programmes. Promising results in smaller studies are not replicated in larger studies. Prospective studies are required to measure the effect of AI in clinical practice. Such studies will require clear stopping rules to ensure that AI does not reduce programme specificity.

Study registration Protocol registered as PROSPERO CRD42020213590.
To determine whether the recommended screening interval for diabetic retinopathy (DR) in the UK can safely be extended beyond 1 year.

Systematic review of clinical and cost-effectiveness studies. Nine databases were searched with no date restrictions. Randomised controlled trials (RCTs), cohort studies, and prognostic or economic modelling studies describing the incidence and progression of DR in populations with type 1 or type 2 diabetes mellitus, of either sex and of any age, reporting incidence and progression of DR in relation to screening interval (vs an annual screening interval) and/or prognostic factors were included. Narrative synthesis was undertaken.

14 013 papers were identified, of which 11 observational studies, 5 risk stratification modelling studies, and 9 economic studies were included. Data were available for 262 541 patients, of whom at least 228 649 (87%) had type 2 diabetes. There were no RCTs. Studies concluded that there is little difference between clinical outcomes from screening 1 yearly or 2 yearly in low-risk patients. However, there was high loss to follow-up (13–31%), heterogeneity in definitions of low risk, and variation in screening and grading protocols for prior retinopathy results.

Observational and economic modelling studies in low-risk patients show little difference in clinical outcomes between 1-year and 2-year screening intervals. The lack of experimental research designs and the heterogeneity in definitions of low risk considerably limit the reliability and validity of this conclusion. Cost-effectiveness findings were mixed. There is insufficient evidence to recommend extending the screening interval beyond 1 year.
Syndromic surveillance provided near real-time monitoring and could identify hourly changes in patterns of presentation during the London 2012 Olympic Games. Planners of future mass-gathering events can be reassured that there was no discernible impact on overall attendances at sentinel emergency departments (EDs) or general practitioner out-of-hours (GP OOH) services in the host country. The increase in attendances for alcohol-related causes during the opening ceremony, however, may provide an opportunity for future public health interventions.

Todkill D, Hughes HE, Elliot AJ, Morbey RA, Edeghere O, Harcourt S, Hughes T, Endericks T, McCloskey B, Catchpole M, Ibbotson S, Smith G. An observational study using English syndromic surveillance data collected during the 2012 London Olympics - what did syndromic surveillance show and what can we learn for future mass-gathering events? Prehosp Disaster Med. 2016;31(6):628-634.