Semantic segmentation is the process of clustering the pixels that belong to the same object class in a frame. Semantic segmentation based on deep convolutional neural networks requires large-scale computation and annotated training data, which makes real-time inference speeds hard to reach. Heterogeneous image segmentation is a more challenging per-pixel classification task, since the features of visible and thermal images are extracted separately. We designed an efficient architecture with a multi-hybrid autoencoder and decoder for Faster Heterogeneous Image (FHI) semantic segmentation. The proposed architecture has fewer layers, which results in fewer parameters, higher inference speed, and higher Intersection over Union (IoU). Its distinguishing feature is a discrete, autonomous feature extraction framework in which the RGB image and Thermal (T) image inputs are processed by individual convolutional layers. The 4-channel (RGBT) convolution features are then combined to reduce computational complexity and make the model more robust. The proposed FHI-Unet semantic segmentation model was evaluated on the NVIDIA Xavier NX edge AI platform and achieved standard accuracy under the real-time inference requirement. It reached the highest mIoU of 43.67 and the fastest real-time inference of 83.39 frames per second on the edge AI implementation. Compared with existing works on the Multi-spectral Semantic Segmentation Dataset, the proposed approach improves inference speed by 31.36%, mAcc by 7.16%, and mIoU by 5.1%.
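The IoU and mIoU figures quoted above follow the usual definition: for each class, the number of pixels where prediction and ground truth agree on that class, divided by the number of pixels where either one assigns it, averaged over the classes. A minimal sketch of that metric, assuming flat lists of integer class labels; the function name `mean_iou` and the toy labels are illustrative and are not taken from the paper:

```python
def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union for flat lists of per-pixel class labels."""
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth say class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where at least one of them says class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes that appear in neither list
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with two classes (hypothetical labels).
pred = [0, 0, 1, 1]
target = [0, 1, 1, 1]
print(round(mean_iou(pred, target, num_classes=2), 4))  # 0.5833
```

Classes absent from both prediction and ground truth are skipped here so they do not distort the average; other conventions exist, and the abstract does not say which one the authors use.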
Dehazing algorithms are based on the haze simulation equation: they remove haze and restore the input image feature maps by estimating the intensity coefficient of the atmospheric light source and the scattering coefficient of the atmosphere. However, inaccurate coefficient prediction results in artifact noise in the dehazed output image. Deep learning algorithms are increasingly used in computer vision applications to combat noise and interference in hazy pictures. This paper proposes an efficient Feature Integration and Block Smoothing Unet (FIBS-Unet) architecture that uses encoder-decoder processing with an intensity attention block. We modified the Res2Net residual block with customized convolution and added instance normalization to improve the efficiency of encoder feature extraction. In addition, we designed the Intensity Attention Block (IAB) using a sub-pixel layer and a 1 × 1 convolution to amplify the input feature and fusion feature maps. We developed an efficient decoder that employs sub-pixel convolutions, concatenations, contrive convolutions, and multipliers to recover smooth, high-quality feature maps. The proposed FIBS-Unet minimizes the Mean Absolute Error (MAE) in the perceptual loss function on the RESIDE dataset. We calculated the Peak Signal-to-Noise Ratio (PSNR), the Structural Similarity Index Measure (SSIM), and a subjective visual color difference to evaluate the model's effectiveness. On the Synthetic Objective Testing Set (SOTS), the proposed FIBS-Unet achieved better dehazing quality, with a PSNR of 34.122 and an SSIM of 0.9890 in outdoor scenarios with dense haze and backlit images. Our extensive experimental results indicate that the proposed FIBS-Unet is extendable to real-time applications.
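The haze simulation equation mentioned above is usually taken to be the atmospheric scattering model, I = J·t + A·(1 − t) with transmission t = exp(−β·d), where J is the clear scene, A the atmospheric light, β the scattering coefficient, and d the scene depth. The sketch below is not the FIBS-Unet method; it only illustrates that model, its inversion when A and β are known, and the PSNR metric used for evaluation. All function names and the toy pixel values are assumptions made for illustration:

```python
import math


def add_haze(clear, depth, airlight, beta):
    """Apply I = J*t + A*(1 - t) with t = exp(-beta * d) to each pixel."""
    hazy = []
    for j, d in zip(clear, depth):
        t = math.exp(-beta * d)
        hazy.append(j * t + airlight * (1.0 - t))
    return hazy


def remove_haze(hazy, depth, airlight, beta, t_min=0.1):
    """Invert the model: J = (I - A) / t + A, clamping t to avoid blow-up."""
    restored = []
    for i, d in zip(hazy, depth):
        t = max(math.exp(-beta * d), t_min)
        restored.append((i - airlight) / t + airlight)
    return restored


def psnr(reference, estimate, peak=1.0):
    """Peak Signal-to-Noise Ratio in dB for intensities in [0, peak]."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    return float("inf") if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical one-channel pixels with known depth, airlight, and beta.
clear = [0.2, 0.5, 0.8]
depth = [1.0, 2.0, 3.0]
hazy = add_haze(clear, depth, airlight=0.9, beta=0.5)
restored = remove_haze(hazy, depth, airlight=0.9, beta=0.5)
print(psnr(clear, hazy) < psnr(clear, restored))  # True
```

With exact coefficients the inversion recovers the clear pixels; errors in the estimated A or β are what produce the artifact noise that the abstract attributes to poor coefficient prediction.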
The aim of this paper is to detect vehicles and count the number of vehicles in each class from the input. We propose a Fuzzy Guided Scale Choice (FGSC)-based SSD deep neural network architecture for vehicle detection and class counting with parameter optimization. The FGSC blocks are integrated into the convolutional layers of the model, where they emphasize essential features and ignore less important ones. We created passing detection lines and class counting windows and connected them to the proposed FGSC-SSD deep neural network model. During training, the FGSC blocks use the scale choice method to identify unnecessary features and eliminate them, which significantly speeds up the model. The FGSC blocks also avoid many unusable parameters in the saturation interval, which improves performance efficiency, and the Fuzzy Sigmoid Function (FSF) widens the activation interval through fuzzy logic. In operation, the FGSC-SSD model reduces the computational complexity of the convolutional layers and their parameters. As a result, when tested on an edge artificial intelligence (AI) platform, the model reached a real-time processing speed of 38.4 frames per second (FPS) and an accuracy of more than 94%. This work can therefore be considered an improvement to traffic monitoring using edge AI applications.
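The abstract does not describe how the passing detection lines and class counting windows are implemented. One common way to realize the idea is to count a vehicle the first time its tracked centroid crosses a horizontal line. The sketch below assumes that detection and tracking have already produced a class label and a sequence of centroid y coordinates per track; the function name, the track format, and the sample values are all hypothetical:

```python
from collections import Counter


def count_line_crossings(tracks, line_y):
    """Count, per class, the tracks whose centroid crosses line_y downward.

    tracks maps a track id to (class_name, [centroid y for each frame]).
    """
    counts = Counter()
    for class_name, ys in tracks.values():
        for prev_y, curr_y in zip(ys, ys[1:]):
            if prev_y < line_y <= curr_y:
                counts[class_name] += 1
                break  # count each track at most once
    return counts


# Hypothetical tracks: two cars cross the line at y = 50, the truck does not.
tracks = {
    1: ("car", [10, 40, 60]),
    2: ("truck", [70, 80]),
    3: ("car", [45, 55, 45, 55]),
}
print(count_line_crossings(tracks, line_y=50)["car"])  # 2
```

Stopping at the first crossing keeps a vehicle that jitters around the line from being counted twice.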
To support the memory interface and faster access of static RAM in near-threshold operation, a stable local bit-line static random-access memory (SRAM) architecture is proposed together with a low-voltage pre-charge and negative local bit-line (NLBL) scheme. The low-voltage pre-charge and NLBL scheme is operated by the write bit-line column to resolve the write half-select condition. The proposed local bit-line SRAM design reduces variations, enhances the read stability and the write capacity, and prevents bit-line leakage current, while the designed pre-charge circuit achieves an optimal pre-charge voltage during near-threshold operation. Compared with the conventional 6T SRAM design, the optimal pre-charge voltage improves the read static noise margin (RSNM) by up to 15%, and the write delay of the energy-efficient NLBL SRAM design is improved by up to 22%. At a 400 mV supply voltage and a 25 MHz operating frequency, the read and write energy consumption is 0.22 pJ and 0.23 pJ, respectively. The access average energy (AAE) is lower than in related works, and the proposed local bit-line SRAM achieves the highest figure of merit (FoM) overall. The architecture has been implemented as 1-Kb SRAM macros in the TSMC 40 nm GP process technology.
Discriminative object tracking systems for unmanned aerial vehicles (UAVs) are widely used in numerous applications. Although ample research has been carried out in this domain, implementing a low-computational-cost algorithm on a UAV onboard embedded system remains challenging. To address this issue, we propose a low-computational-complexity discriminative object tracking system for UAVs that uses the patch color group feature (PCGF) framework. The tracked object is separated into several non-overlapping local image patches, and the features of each patch are extracted into PCGFs, which consist of a Gaussian mixture model (GMM). The object location is calculated by comparing similar PCGFs between the previous frame and the current frame. The background PCGFs of the object are removed by four-direction feature scanning and dynamic threshold comparison, which improves accuracy. In terms of execution speed, the proposed algorithm achieved 32.5 frames per second (FPS) on an x64 CPU platform without a GPU accelerator and 17 FPS on a Raspberry Pi 4. This work can therefore be considered a good solution for running a low-computational-complexity PCGF algorithm on a UAV onboard embedded system to improve flight times.
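The first step described above, separating the object region into non-overlapping local patches and extracting a color feature per patch, can be sketched as follows. This is a simplified illustration, not the PCGF algorithm: the mean RGB color stands in for the Gaussian mixture model used by the authors, and the function names, patch size, and image values are assumptions:

```python
def split_into_patches(height, width, patch):
    """Return (top, left, bottom, right) boxes of non-overlapping patches."""
    return [
        (y, x, min(y + patch, height), min(x + patch, width))
        for y in range(0, height, patch)
        for x in range(0, width, patch)
    ]


def patch_mean_color(image, box):
    """Mean (R, G, B) of one patch; a stand-in for a per-patch GMM."""
    top, left, bottom, right = box
    pixels = [image[y][x] for y in range(top, bottom) for x in range(left, right)]
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


# Hypothetical 4x4 object region: left half red, right half blue.
image = [[(255, 0, 0)] * 2 + [(0, 0, 255)] * 2 for _ in range(4)]
boxes = split_into_patches(4, 4, patch=2)
features = [patch_mean_color(image, box) for box in boxes]
print(len(features))  # 4
```

Per-patch features of this kind can then be compared between the previous and current frames to locate the object, which is the role the abstract assigns to the PCGF comparison.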
Modeling human body key-points appropriately is the most significant aspect of pose estimation. Computer vision algorithms identify human pose, body movement, and action in many ways. Most previous works focus on either accuracy or efficiency in terms of speed. However, many techniques suffer from intensive computational demands when low latency or higher processing speed is required. We have designed a unique approach for single-person pose estimation and action recognition that is well suited to fitness applications and mobility activities. The proposed framework consists of a base network that provides an initial pose, which is further refined by an Intensive Feature Consistency (IFC) network. The IFC network enforces high-level constraints on the global body intensity correction and the local body part adjustments. The proposed module reduces the impact of the diversity of body joint movements by interpreting a long-term consistent view. We demonstrate the effectiveness of the proposed framework through improved pose estimation accuracy on two benchmark datasets, which indicates state-of-the-art performance of the IFC network at the required real-time processing speed on the CPU platform. The IFC network reached 99.1% PCK body and 94.7% PCK torso accuracy at 31 FPS, which is higher than the existing work.
INDEX TERMS: Single person pose estimation, intensive feature consistency, global body intensity, local part adjustments, skeleton joint key-points.
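PCK (Percentage of Correct Keypoints) counts a predicted key-point as correct when its distance to the ground truth is within a fraction of a reference length, such as the torso or body size. The abstract does not state the thresholds or reference lengths behind its "PCK body" and "PCK torso" figures, so the `alpha` value, the function name, and the sample coordinates below are assumptions for illustration only:

```python
import math


def pck(pred, gt, ref_length, alpha=0.2):
    """Fraction of key-points within alpha * ref_length of the ground truth."""
    threshold = alpha * ref_length
    correct = sum(
        1
        for (px, py), (gx, gy) in zip(pred, gt)
        if math.hypot(px - gx, py - gy) <= threshold
    )
    return correct / len(gt)


# Hypothetical key-points; with ref_length = 10 the threshold is 2 pixels.
pred = [(0, 0), (10, 10), (20, 25)]
gt = [(0, 1), (10, 10), (20, 20)]
print(round(pck(pred, gt, ref_length=10), 4))  # 0.6667
```

Changing the reference length from a body-sized measure to a torso-sized one tightens or loosens the threshold, which is why the two PCK variants can report different accuracies for the same predictions.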