A key element of contemporary computer vision, image fusion aims to improve the quality and interpretability of images by combining complementary data from several image sources or modalities. Building on recent advances in deep learning and matrix factorization, this paper presents a novel method for multi-modal image fusion that combines the strengths of Deep Convolutional Neural Networks (CNNs) and Non-Negative Matrix Factorization (NMF). Deep CNNs have proven remarkably effective at extracting features from images, capturing complex patterns and discriminative information. In the proposed technique, an ensemble of deep CNNs is trained on a diverse dataset of multi-modal images. These networks extract and encode pertinent features from each modality, yielding information-rich representations suitable for fusion. During fusion, the features derived from the CNNs are concatenated, producing a fused feature representation that faithfully expresses the input modalities. The main novelty is the two-stage integration of NMF: first, the fused feature representation is decomposed into non-negative basis vectors and coefficients; then, NMF is applied again to extract salient patterns from the fused feature maps. The non-negativity constraint in NMF preserves the natural structures and characteristics of the source images, resulting in fused images that are both visually pleasing and semantically interpretable. Visual inspection of the fused images demonstrates the method's ability to effectively capture important information from multiple modalities. Comparison with existing fusion approaches highlights the superior performance and robustness of the proposed method, which achieves an accuracy of approximately 99.12%.
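The fusion pipeline described above (concatenate CNN features, then factorize with NMF) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the random non-negative arrays stand in for per-modality CNN feature maps (as would come from ReLU activations), and the shapes and component count are hypothetical choices.

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)

# Hypothetical stand-ins for CNN feature maps from two modalities:
# each is (n_spatial_positions, n_features), non-negative as after ReLU.
feat_mod1 = rng.random((64, 32))
feat_mod2 = rng.random((64, 32))

# Fusion by concatenation along the feature axis.
fused = np.concatenate([feat_mod1, feat_mod2], axis=1)  # shape (64, 64)

# NMF decomposes the fused representation into non-negative
# coefficients W and basis vectors H, so that fused ≈ W @ H.
model = NMF(n_components=8, init="nndsvda", random_state=0, max_iter=500)
W = model.fit_transform(fused)  # (64, 8) non-negative coefficients
H = model.components_           # (8, 64) non-negative basis vectors

# Reconstruction from the factorization; in the paper's pipeline the
# fused image is synthesized from such non-negative factors.
reconstruction = W @ H
```

Because both factors are constrained to be non-negative, every reconstructed value is a purely additive combination of basis patterns, which is the property the abstract credits for preserving the source images' natural structure.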