A deep supervised transformer U‐shaped full‐resolution residual network for the segmentation of breast ultrasound image
Jiale Zhou,
Zuoxun Hou,
Hongyan Lu
et al.
Abstract: Purpose: Breast ultrasound (BUS) is an important breast imaging tool. Automatic BUS image segmentation can measure breast tumor size objectively and reduce doctors' workload. In this article, we proposed a deep supervised transformer U-shaped full-resolution residual network (DSTransUFRRN) to segment BUS images. Methods: In the proposed method, a full-resolution residual stream and a deep supervision mechanism were introduced into TransU-Net. The residual stream can keep full-resolution features from different …
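The abstract describes adding a deep supervision mechanism to TransU-Net. As a minimal sketch only (the loss choice, weighting scheme, and nearest-neighbour downsampling here are illustrative assumptions, not details from the paper), deep supervision attaches auxiliary segmentation outputs at several decoder depths and compares each against the ground-truth mask resized to that depth's resolution:

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss between a predicted probability map and a binary mask."""
    inter = np.sum(pred * target)
    return 1.0 - (2.0 * inter + eps) / (np.sum(pred) + np.sum(target) + eps)

def deep_supervised_loss(side_outputs, target, weights):
    """Weighted sum of Dice losses over side outputs at different decoder depths.

    Each side output is compared against the ground-truth mask downsampled
    (by simple striding here) to that side output's resolution.
    """
    total = 0.0
    for pred, w in zip(side_outputs, weights):
        stride = target.shape[0] // pred.shape[0]
        t = target[::stride, ::stride]  # nearest-neighbour downsample of the mask
        total += w * dice_loss(pred, t)
    return total

# Toy example: a 16x16 mask and two side outputs (full and half resolution).
mask = np.zeros((16, 16)); mask[4:12, 4:12] = 1.0
full = mask.copy()                               # perfect full-resolution prediction
half = np.zeros((8, 8)); half[2:6, 2:6] = 1.0    # perfect half-resolution prediction
loss = deep_supervised_loss([full, half], mask, weights=[0.7, 0.3])
print(loss)  # ~0.0 for perfect predictions at both depths
```

In practice the auxiliary losses are typically computed on logits from intermediate decoder stages during training only, and the side heads are discarded at inference.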
“…This arises from the quadratic time and space complexity of the attention mechanism in the transformer architecture. For instance, U-Net++ [94], a CNN-based model, requires approximately 9.163 million parameters to achieve a Dice score of 76.40 on the BUSI dataset [38]. In contrast, TransUnet [46], which achieves a higher Dice score of 81.18 on the BUSI dataset, requires about 44.00 million parameters [38].…”
Section: Discussion
confidence: 99%
“…For instance, U-Net++ [94], a CNN-based model, requires approximately 9.163 million parameters to achieve a Dice score of 76.40 on the BUSI dataset [38]. In contrast, TransUnet [46], which achieves a higher Dice score of 81.18 on the BUSI dataset, requires about 44.00 million parameters [38]. Nevertheless, researchers must grapple with the intense GPU resource demands such models impose.…”
Section: Discussion
confidence: 99%
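The quadratic cost cited above can be made concrete with a small sketch (the image and patch sizes are illustrative assumptions, not values from the cited works): in a vision transformer, an image is split into patches, each patch becomes one token, and self-attention forms an n_tokens × n_tokens score matrix, so memory grows quadratically in the token count:

```python
def attention_matrix_elements(image_size, patch_size):
    """Number of entries in one self-attention score matrix (n_tokens ** 2)."""
    n_tokens = (image_size // patch_size) ** 2  # patches per side, squared
    return n_tokens ** 2

# Doubling the input resolution quadruples the token count, which multiplies
# the attention matrix size by 16 -- the quadratic scaling discussed above.
a = attention_matrix_elements(224, 16)  # 14 * 14 = 196 tokens
b = attention_matrix_elements(448, 16)  # 28 * 28 = 784 tokens
print(a, b, b // a)  # 38416 614656 16
```

This scaling, per attention head and per layer, is why transformer-based segmentation models tend to demand far more GPU memory than CNNs of comparable parameter count.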
“…He et al. [38] introduced a hybrid CNN–transformer network (HCTNet) consisting of transformer encoder blocks (TEBlocks) in the encoder and a spatial-wise cross attention (SCA) module in the decoder to enhance breast lesion segmentation in BUS images. Their application of HCTNet highlighted the importance of local features captured by its convolutional kernels, though this focus led to difficulties in evaluating tumor-like shadows and speckle noise.…”
Ultrasound (US) has become a widely used imaging modality in clinical practice, characterized by its rapidly evolving technology, advantages, and unique challenges, such as low imaging quality and high variability. There is a need to develop advanced automatic US image analysis methods to enhance its diagnostic accuracy and objectivity. Vision transformers, a recent innovation in machine learning, have demonstrated significant potential in various research fields, including general image analysis and computer vision, due to their capacity to process large datasets and learn complex patterns. Their suitability for automatic US image analysis tasks, such as classification, detection, and segmentation, has been recognized. This review provides an introduction to vision transformers and discusses their applications in specific US image analysis tasks, while also addressing the open challenges and potential future trends in their application to medical US image analysis. Vision transformers have shown promise in enhancing the accuracy and efficiency of ultrasound image analysis and are expected to play an increasingly important role in the diagnosis and treatment of medical conditions using ultrasound imaging as the technology progresses.