Semantic segmentation is the process of clustering the pixels in a frame that belong to the same class of objects. Semantic segmentation based on deep convolutional neural networks requires large-scale computation and extensive annotated training data, which makes real-time inference speeds difficult to reach. Heterogeneous image segmentation is an even more challenging per-pixel classification task, and heterogeneous semantic segmentation methods extract the features of visible and thermal images separately. We designed an efficient architecture with a multi-hybrid autoencoder and decoder for Faster Heterogeneous Image (FHI) semantic segmentation. The proposed architecture has fewer layers, resulting in fewer parameters, higher inference speed, and higher Intersection over Union (IoU). The distinguishing feature of this architecture is a separate, independent feature extraction framework for the RGB and Thermal (T) image inputs, each with its own convolutional layers. The convolution features of the four channels (RGBT) are then combined to reduce computational complexity and make the model's performance more robust. The proposed FHI-Unet semantic segmentation model was evaluated on the NVIDIA Xavier NX edge AI platform, where it reached standard accuracy under the real-time inference requirement. FHI-Unet has the highest mIoU, 43.67, and the fastest real-time inference, 83.39 frames per second, in the edge AI implementation. Compared with existing work, the proposed approach improves inference speed by 31.36%, mAcc by 7.16%, and mIoU by 5.1% on the Multi-spectral Semantic Segmentation Dataset.
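The abstract reports mIoU and mAcc without defining them. The following is a minimal sketch of how these two metrics are commonly computed from a confusion matrix, assuming mAcc means mean per-class accuracy (recall) and that classes absent from both prediction and ground truth are skipped. The function names and the toy labels are illustrative and are not taken from the FHI-Unet paper.

```python
from typing import List, Sequence, Tuple


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def miou_and_macc(cm: List[List[int]]) -> Tuple[float, float]:
    """Return (mean IoU, mean per-class accuracy) from a confusion matrix."""
    n = len(cm)
    ious, accs = [], []
    for c in range(n):
        tp = cm[c][c]                              # pixels correctly labelled c
        gt = sum(cm[c])                            # pixels whose true label is c
        pred = sum(cm[r][c] for r in range(n))     # pixels predicted as c
        union = gt + pred - tp
        if union > 0:                              # skip classes that never occur
            ious.append(tp / union)
        if gt > 0:
            accs.append(tp / gt)
    miou = sum(ious) / len(ious) if ious else 0.0
    macc = sum(accs) / len(accs) if accs else 0.0
    return miou, macc


# Toy example: six flattened pixels, three classes (hypothetical labels).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
miou, macc = miou_and_macc(confusion_matrix(y_true, y_pred, 3))
print(f"mIoU={miou:.4f}  mAcc={macc:.4f}")
```

For a real dataset, the confusion matrix would be accumulated over all test images before the per-class ratios are averaged.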
Dehazing algorithms are based on the haze simulation equation: they remove haze and restore the input image feature maps by estimating the intensity coefficient of the atmospheric light source and the scattering coefficient of the atmosphere. However, these coefficients are often predicted poorly, which leaves artifact noise in the dehazed output image. Deep learning algorithms are increasingly used in computer vision applications to combat noise and interference in hazy pictures. This paper proposes the Feature Integration and Block Smoothing Unet (FIBS-Unet), an efficient framework that uses encoder-decoder processing with an intensity attention block. We modified the Res2Net residual block with customized convolution and added instance normalization to improve the encoder's feature extraction efficiency. In addition, we designed the Intensity Attention Block (IAB) using a sub-pixel layer and a 1 × 1 convolution to amplify the input feature and fusion feature maps. We developed an efficient decoder employing sub-pixel convolutions, concatenations, contrive convolutions, and multipliers to recover smooth, high-quality feature maps within the framework. The proposed FIBS-Unet minimizes the Mean Absolute Error (MAE) in the perceptual loss function on the RESIDE dataset. We calculated the Peak Signal-to-Noise Ratio (PSNR), the Structural Similarity Index Measure (SSIM), and a subjective visual color difference to evaluate the model's effectiveness. FIBS-Unet achieved better dehazing quality, with a PSNR of 34.122 and an SSIM of 0.9890, in outdoor scenarios with dense haze and backlit images from the Synthetic Objective Testing Set (SOTS). Our extensive experimental results indicate that the proposed FIBS-Unet is extendable to real-time applications.
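The abstract refers to the haze simulation equation without writing it out. The sketch below assumes it is the widely used atmospheric scattering model, I(x) = J(x) t(x) + A (1 - t(x)) with transmission t(x) = exp(-beta d(x)), where A is the atmospheric light intensity and beta the scattering coefficient. It shows how a poor estimate of A degrades the restored image, measured with PSNR. All function names, parameters, and pixel values are hypothetical, and this is the classical model-based inversion, not the FIBS-Unet network.

```python
import math
from typing import List


def transmission(beta: float, depth: float) -> float:
    """Fraction of scene light that survives scattering over the given depth."""
    return math.exp(-beta * depth)


def add_haze(clear: List[float], depth: List[float],
             airlight: float, beta: float) -> List[float]:
    """Forward model: I = J * t + A * (1 - t), applied per pixel."""
    out = []
    for j, d in zip(clear, depth):
        t = transmission(beta, d)
        out.append(j * t + airlight * (1.0 - t))
    return out


def remove_haze(hazy: List[float], depth: List[float], airlight: float,
                beta: float, t_min: float = 0.1) -> List[float]:
    """Inverse model: J = (I - A) / max(t, t_min) + A, clamped to [0, 1]."""
    out = []
    for i, d in zip(hazy, depth):
        t = max(transmission(beta, d), t_min)   # avoid dividing by tiny t
        j = (i - airlight) / t + airlight
        out.append(min(1.0, max(0.0, j)))
    return out


def psnr(reference: List[float], estimate: List[float],
         peak: float = 1.0) -> float:
    """Peak Signal-to-Noise Ratio in dB for intensities in [0, peak]."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0.0:
        return math.inf
    return 10.0 * math.log10(peak * peak / mse)


# Hypothetical 3-pixel grayscale "image" with per-pixel scene depth.
clear = [0.2, 0.5, 0.8]
depth = [1.0, 2.0, 3.0]
hazy = add_haze(clear, depth, airlight=0.9, beta=0.5)

good = remove_haze(hazy, depth, airlight=0.9, beta=0.5)  # correct coefficients
bad = remove_haze(hazy, depth, airlight=0.7, beta=0.5)   # misestimated airlight
print("PSNR with correct A:", psnr(clear, good))
print("PSNR with wrong A:  ", psnr(clear, bad))
```

With the correct coefficients the inversion recovers the clear pixels up to floating-point error, while the misestimated airlight leaves a residual that grows with depth. This is the kind of coefficient-estimation artifact that motivates learned dehazing approaches.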