Swin transformer-based GAN for multi-modal medical image translation

Yan, Shouang; Wang, Chengyan; Chen, Weibo; Lyu, Jun

doi:10.3389/fonc.2022.942511

Cited by 27 publications

(11 citation statements)

References 35 publications


“…To quantitatively evaluate the performance, we used mean absolute error (MAE), SSIM, and peak signal-to-noise ratio (PSNR) metrics. The performances of the models were compared between modified U-Net and other DL models 17,23–25 . Augmentation techniques such as scaling, rotating, and flipping were implemented during the training process to enhance the dataset.…”

Section: Methods (mentioning)

confidence: 99%
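As a rough illustration of two of the metrics named in the statement above, the following is a minimal sketch of MAE and PSNR in plain Python. The function names, the flattened-list image representation, and the `data_range=1.0` default are assumptions made for this example and are not taken from the citing paper. SSIM is omitted because it requires windowed local statistics.

```python
import math


def mean_absolute_error(pred, target):
    """Mean of |pred - target| over all pixels (images given as flat lists)."""
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def psnr(pred, target, data_range=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(data_range^2 / MSE).

    data_range is the assumed maximum intensity span (1.0 for images
    normalized to [0, 1]); this default is an assumption of the sketch.
    """
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(data_range ** 2 / mse)


if __name__ == "__main__":
    generated = [0.0, 0.5, 1.0, 0.5]   # hypothetical generated pixels
    reference = [0.1, 0.5, 0.9, 0.5]   # hypothetical reference pixels
    print(mean_absolute_error(generated, reference))
    print(psnr(generated, reference))
```

Lower MAE and higher PSNR both indicate closer agreement with the reference image. PSNR depends on the assumed intensity range, so values are only comparable when the same normalization is used.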


Clinical Feasibility of Deep Learning–Based Attenuation Correction Models for Tl-201 Myocardial Perfusion SPECT

Lim, Park, Lee et al. 2024, Clin Nucl Med


Purpose We aimed to develop deep learning (DL)–based attenuation correction models for Tl-201 myocardial perfusion SPECT (MPS) images and evaluate their clinical feasibility. Patients and Methods We conducted a retrospective study of patients with suspected or known coronary artery disease. We proposed a DL-based image-to-image translation technique to transform non–attenuation-corrected images into CT-based attenuation-corrected (CTAC) images. The model was trained using a modified U-Net with structural similarity index (SSIM) loss and mean squared error (MSE) loss and compared with other models. Segment-wise analysis using a polar map and visual assessment for the generated attenuation-corrected (GENAC) images were also performed to evaluate clinical feasibility. Results This study comprised 657 men and 328 women (age, 65 ± 11 years). Among the various models, the modified U-Net achieved the highest performance with an average mean absolute error of 0.003, an SSIM of 0.990, and a peak signal-to-noise ratio of 33.658. The performance of the model was not different between the stress and rest datasets. In the segment-wise analysis, the myocardial perfusion of the inferior wall was significantly higher in GENAC images than in the non–attenuation-corrected images in both the rest and stress test sets (P < 0.05). In the visual assessment of patients with diaphragmatic attenuation, scores of 4 (similar to CTAC images) or 5 (indistinguishable from CTAC images) were assigned to most GENAC images (65/68). Conclusions Our clinically feasible DL-based attenuation correction models can replace the CT-based method in Tl-201 MPS, and it would be useful in case SPECT/CT is unavailable for MPS.



“…The performances of the models were compared between modified U-Net and other DL models 17,23–25 . Augmentation techniques such as scaling, rotating, and flipping were implemented during the training process to enhance the dataset.…”

Section: Implementation Details (mentioning)

confidence: 99%

Clinical Feasibility of Deep Learning–Based Attenuation Correction Models for Tl-201 Myocardial Perfusion SPECT

Lim, Park, Lee et al. 2024, Clin Nucl Med


“…After preprocessing, deep learning models were proposed to segment lesions, which greatly improved future work efficiency. This work selected a deep learning model method based on the transformer architecture (Swin transformer) because of its superiority in multiple domains [15, 16]. The Swin transformer adopts a hierarchical design containing a total of four stages: each stage decreases the resolution of the input feature map and expands the perceptual field layer by layer, similar to a convolutional neural network.…”

Section: Methods (mentioning)

confidence: 99%
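The four-stage hierarchy described in the statement above can be sketched as a simple shape calculation. The defaults below (patch size 4, embedding dimension 96, resolution halved and channels doubled at each later stage) follow the commonly published Swin-T configuration and are assumptions for illustration; the citing paper does not state which configuration it used. The function name is hypothetical.

```python
def swin_stage_shapes(height, width, patch_size=4, embed_dim=96, num_stages=4):
    """Return the (height, width, channels) of the feature map after each stage.

    Stage 1 splits the image into patch_size x patch_size patches and embeds
    them; each later stage merges 2x2 neighboring patches, which halves the
    spatial resolution and doubles the channel count.
    """
    total_downsample = patch_size * 2 ** (num_stages - 1)
    if height % total_downsample or width % total_downsample:
        # Real implementations typically pad instead; this sketch just rejects.
        raise ValueError(f"height and width must be divisible by {total_downsample}")

    shapes = []
    h, w, c = height // patch_size, width // patch_size, embed_dim
    for stage in range(num_stages):
        if stage > 0:
            h, w, c = h // 2, w // 2, c * 2  # patch merging
        shapes.append((h, w, c))
    return shapes


if __name__ == "__main__":
    for i, shape in enumerate(swin_stage_shapes(224, 224), start=1):
        print(f"stage {i}: {shape}")
```

For a 224 x 224 input under these assumed defaults, the feature map shrinks from 56 x 56 to 7 x 7 while the channel count grows from 96 to 768, which is the CNN-like pyramid the statement refers to.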

Multimodal-based machine learning strategy for accurate and non-invasive prediction of intramedullary glioma grade and mutation status of molecular markers: a retrospective study

Wang, Song et al. 2023, BMC Med


Background Determining the grade and molecular marker status of intramedullary gliomas is important for assessing treatment outcomes and prognosis. Invasive biopsy for pathology usually carries a high risk of tissue damage, especially to the spinal cord, and there are currently no non-invasive strategies to identify the pathological type of intramedullary gliomas. Therefore, this study aimed to develop a non-invasive machine learning model to assist doctors in identifying the intramedullary glioma grade and mutation status of molecular markers. Methods A total of 461 patients from two institutions were included, and their sagittal (SAG) and transverse (TRA) T2-weighted magnetic resonance imaging scans and clinical data were acquired preoperatively. We employed a transformer-based deep learning model to automatically segment lesions in the SAG and TRA phases and extract their radiomics features. Different feature representations were fed into the proposed neural networks and compared with those of other mainstream models. Results The dice similarity coefficients of the Swin transformer in the SAG and TRA phases were 0.8697 and 0.8738, respectively. The results demonstrated that the best performance was obtained in our proposed neural networks based on multimodal fusion (SAG-TRA-clinical) features. In the external validation cohort, the areas under the receiver operating characteristic curve for graded (WHO I–II or WHO III–IV), alpha thalassemia/mental retardation syndrome X-linked (ATRX) status, and tumor protein p53 (P53) status prediction tasks were 0.8431, 0.7622, and 0.7954, respectively. Conclusions This study reports a novel machine learning strategy that, for the first time, is based on multimodal features to predict the ATRX and P53 mutation status and grades of intramedullary gliomas. 
The generalized application of these models could non-invasively provide more tumor-specific pathological information for determining the treatment and prognosis of intramedullary gliomas.


“…The transformer-based network architecture has demonstrated competitive performance in various generative models [45–47]. Inspired by this, we construct a score network named 'TransDiff', which use the swin transformer layer as the backbone, aiming to enhance its feature extraction capabilities and achieve self-attention from local to global.…”

Section: Score Network 'TransDiff' (mentioning)

confidence: 99%

Generation model meets swin transformer for unsupervised low-dose CT reconstruction

Li, Sun, Wang et al. 2024, Mach. Learn.: Sci. Technol.


Computed Tomography (CT) has evolved into an indispensable tool for clinical diagnosis. Reducing radiation dose crucially minimizes adverse effects but may introduce noise and artifacts in reconstructed images, affecting diagnostic processes for physicians. Scholars have tackled deep learning training instability by exploring diffusion models. Given the scarcity of clinical data, we propose the Unsupervised Image Domain Score Generation model (UISG) for low-dose CT reconstruction. During training, normal-dose CT images are utilized as network inputs to train a score-based generative model that captures the prior distribution of CT images. In the iterative reconstruction, the initial CT image is obtained using a filtered back-projection algorithm. Subsequently, diffusion-based prior, high-frequency convolutional sparse coding prior, and data-consistency steps are employed to obtain the high-quality reconstructed image. Given the global characteristics of noise, the score network of the diffusion model utilizes a swin transformer structure to enhance the model's ability to capture long-range dependencies. Furthermore, convolutional sparse coding is applied exclusively to the high-frequency components of the image, to prevent over-smoothing or the loss of crucial anatomical details during the denoising process. Quantitative and qualitative results indicate that UISG outperforms competing methods in terms of denoising and generalization performance.

