Generalist medical foundation models, pre-trained on massive medical datasets, have shown great potential as the next generation of medical artificial intelligence (AI). However, collecting millions of medical images is extremely expensive and time-consuming, and raises concerns about leakage of sensitive private patient information. Here, we present a general framework that enables ultra-high data efficiency in building medical foundation models by leveraging expertise-informed generative AI to scale up a limited pre-training dataset. Following this framework, we propose a new foundation model in ophthalmology, DERETFound, which uses only 16.7% (150,786 images) of the real-world colour fundus photography images required by the latest retinal foundation model, RETFound (904,170 images; Y. Zhou et al., Nature 2023). By integrating expert insights into generative AI, we generate approximately one million synthetic images that are consistent with real retinal images in terms of physiological structures and feature distribution. DERETFound achieves comparable or even superior performance to RETFound on nine public datasets across four downstream tasks (diabetic retinopathy grading, glaucoma diagnosis, age-related macular degeneration grading, and multi-disease classification), as well as on challenging external evaluation. In addition, DERETFound demonstrates high label efficiency, saving over 50% of expert-annotated training data compared to RETFound on datasets for diabetic retinopathy grading. Our data-efficient framework challenges the classic view that building medical foundation models requires collecting large amounts of real-world medical data as a prerequisite. It also provides an effective solution for other diseases for which building foundation models was once discouraged by limited data, which has profound significance for medical AI.