Multi-scale Attention U-Net (MsAUNet): A Modified U-Net Architecture for Scene Segmentation

Chattopadhyay, Soham; Basak, Hritam

doi:10.48550/arxiv.2009.06911

Cited by 6 publications

(7 citation statements)

References 27 publications


“…In this paper, we compared the proposed SegMarsViT with existing lightweight semantic segmentation methods. We evaluate SegMarsViT against eight SOTA natural image semantic segmentation methods, including FCN [10], DeepLabV3+ [50], Segmenter [51], PSPNet [52], PSANet [53], SegFormer [38], and FPN-PoolFormer [54].…”

Section: Comparison With SOTA Methods (mentioning)

confidence: 99%

“…Long et al [6] first proposed a fully convolutional network (FCNet), which is a revolutionary work and the majority of following state-of-the-art (SOTA) studies are extensions of the FCN architecture. One of the most pioneering works is UNet presented by Ronneberger et al [7] for biomedical image segmentation, which adopts the influential encoder-decoder architecture and proved to be very useful for other types of image data [8][9][10][11]. Meanwhile, inspired by the high precision that CNNs achieved in semantic segmentation, many CNNs-based approaches were proposed for the Martian terrain segmentation (MTS) task.…”

Section: Introduction (mentioning)

confidence: 99%


SegMarsViT: Lightweight Mars Terrain Segmentation Network for Autonomous Driving in Planetary Exploration

et al. 2022


Planetary rover systems need to perform terrain segmentation to identify feasible driving areas and surrounding obstacles, which falls into the research area of semantic segmentation. Recently, deep learning (DL)-based methods were proposed and achieved great performance for semantic segmentation. However, due to the on-board processor platform’s strict constraints on computational complexity and power consumption, existing DL approaches are almost impossible to deploy on satellites under the burden of extensive computation and large model size. To fill this gap, this paper targeted studying effective and efficient Martian terrain segmentation solutions that are suitable for on-board satellites. In this article, we propose a lightweight ViT-based terrain segmentation method, namely, SegMarsViT. In the encoder part, the mobile vision transformer (MViT) block in the backbone extracts local–global spatial and captures multiscale contextual information concurrently. In the decoder part, the cross-scale feature fusion modules (CFF) further integrate hierarchical context information and the compact feature aggregation module (CFA) combines multi-level feature representation. Moreover, we evaluate the proposed method on three public datasets: AI4Mars, MSL-Seg, and S5Mars. Extensive experiments demonstrate that the proposed SegMarsViT was able to achieve 68.4%, 78.22%, and 67.28% mIoU on the AI4Mars-MSL, MSL-Seg, and S5Mars, respectively, under the speed of 69.52 FPS.


“…Then, a variety of special information extraction structures are drawn according to this working mechanism to automatically learn and calculate the contribution of input data to the output data. Attention mechanisms have proven to be useful in fields such as scene segmentation [39, 40], image understanding [41, 42], fine-grained visual classification [43, 44], and image inpainting [45, 46].…”

Section: Related Work (mentioning)

confidence: 99%

Image-Based Pain Intensity Estimation Using Parallel CNNs with Regional Attention

Liang et al. 2022, Bioengineering

Automatic pain estimation plays an important role in the field of medicine and health. In the previous studies, most of the entire image frame was directly imported into the model. This operation can allow background differences to negatively affect the experimental results. To tackle this issue, we propose the parallel CNNs framework with regional attention for automatic pain intensity estimation at the frame level. This modified convolution neural network structure combines BlurPool methods to enhance translation invariance in network learning. The improved networks can focus on learning core regions while supplementing global information, thereby obtaining parallel feature information. The core regions are mainly based on the tradeoff between the weights of the channel attention modules and the spatial attention modules. Meanwhile, the background information of the non-core regions is shielded by the DropBlock algorithm. These steps enable the model to learn facial pain features adaptively, not limited to a single image pattern. The experimental result of our proposed model outperforms many state-of-the-art methods on the RMSE and PCC metrics when evaluated on the diverse pain levels of over 12,000 images provided by the publicly available UNBC dataset. The model accuracy rate has reached 95.11%. The experimental results show that the proposed method is highly efficient at extracting the facial features of pain and predicts pain levels with high accuracy.


“…Attention mechanisms are also widely known for boosting the performance of CNN-based models in different computer vision applications. Chattopadhyay et al [15] proposed a multi-scale attention mechanism which is inspired by the work of [9] for accurate localization and segmentation of objects. The dual attention mechanism was proposed by [19] adaptively integrates the local features with their corresponding global dependencies.…”

Section: Introduction (mentioning)

confidence: 99%

MFSNet: A Multi Focus Segmentation Network for Skin Lesion Segmentation

Basak, Kundu, Sarkar 2022, Preprint (self-citation)

Segmentation is essential for medical image analysis to identify and localize diseases, monitor morphological changes, and extract discriminative features for further diagnosis. Skin cancer is one of the most common types of cancer globally, and its early diagnosis is pivotal for the complete elimination of malignant tumors from the body. This research develops an Artificial Intelligence (AI) framework for supervised skin lesion segmentation employing the deep learning approach. The proposed framework, called MFSNet (Multi-Focus Segmentation Network), uses differently scaled feature maps for computing the final segmentation mask using raw input RGB images of skin lesions. In doing so, initially, the images are preprocessed to remove unwanted artifacts and noises. The MFSNet employs the Res2Net backbone, a recently proposed convolutional neural network (CNN), for obtaining deep features used in a Parallel Partial Decoder (PPD) module to get a global map of the segmentation mask. In different stages of the network, convolution features and multi-scale maps are used in two boundary attention (BA) modules and two reverse attention (RA) modules to generate the final segmentation output. MFSNet, when evaluated on three publicly available datasets: PH2, ISIC 2017, and HAM10000, outperforms state-of-the-art methods, justifying the reliability of the framework. The relevant codes for the proposed approach are accessible at
