RGB-D Object Recognition Using Multi-Modal Deep Neural Network and DS Evidence Theory

Zeng, Hui; Yang, Bin; Wang, Xiuqing; Liu, Jiwei; Fu, Dongmei

doi:10.3390/s19030529

Cited by 16 publications

(11 citation statements)

References 47 publications

“…To date, most proposed vision-based recognition methods, as mentioned in [39], have used additive Red Green Blue (RGB) colour model-based images for object detection. Nonetheless, as presented by [40], contrast, entropy, correlation, energy, the mean, and the standard deviation can be calculated from examined images.…”

Section: Preliminaries (mentioning)

confidence: 99%

Vision and RTLS Safety Implementation in an Experimental Human—Robot Collaboration Scenario

Slovák, Melicher, Šimovec et al. 2021, Sensors

Human–robot collaboration is becoming ever more widespread in industry because of its adaptability. Conventional safety elements are used when converting a workplace into a collaborative one, although new technologies are becoming more widespread. This work proposes a safe robotic workplace that can adapt its operation and speed depending on the surrounding stimuli. The benefit lies in its use of promising technologies that combine safety and collaboration. Using a depth camera operating on the passive stereo principle, safety zones are created around the robotic workplace, while objects moving around the workplace are identified, including their distance from the robotic system. Passive stereo employs two colour streams that enable distance computation based on pixel shift. The colour stream is also used in the human identification process. Human identification is achieved using the Histogram of Oriented Gradients, pre-learned precisely for this purpose. The workplace also features autonomous trolleys for material supply. Unequivocal trolley identification is achieved using a real-time location system through tags placed on each trolley. The robotic workplace’s speed and the halting of its work depend on the positions of objects within safety zones. The entry of a trolley with an exception to a safety zone does not affect the workplace speed. This work simulates individual scenarios that may occur at a robotic workplace with an emphasis on compliance with safety measures. The novelty lies in the integration of a real-time location system into a vision-based safety system, which are not new technologies by themselves, but their interconnection to achieve exception handling in order to reduce downtimes in the collaborative robotic system is innovative.
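The abstract above says that passive stereo computes distance from the pixel shift between two colour streams. A minimal sketch of that relation, assuming an idealised rectified pinhole stereo pair, is the standard triangulation formula Z = f · B / d. The function name, parameter names, and the numbers in the usage line are hypothetical illustrations and are not taken from the cited paper.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Distance in metres to a point seen in both images of a rectified stereo pair.

    disparity_px    -- horizontal pixel shift of the point between the two images
    focal_length_px -- focal length expressed in pixels (assumed equal for both cameras)
    baseline_m      -- distance between the two camera centres, in metres
    """
    if disparity_px <= 0:
        # Zero or negative shift means no valid match (or a point at infinity).
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


# Illustrative values only: 640 px focal length, 5 cm baseline, 16 px shift.
distance_m = depth_from_disparity(16, 640, 0.05)
print(f"estimated distance: {distance_m:.2f} m")
```

Because distance is inversely proportional to the shift, a fixed matching error of one pixel costs more accuracy for far objects than for near ones, which matters when such distances are compared against safety-zone boundaries.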

“…Zhang et al. [32] constructed a multi-stream network for extracting optical flow, depth and RGB features, and then connect feature channels from different modalities in fully connected layers. Zeng et al. [33] first constructed and trained RGB-CNN and depth-CNN networks, and then trained multimodal feature learning networks to fine-tune parameters.…”

Section: Introduction (mentioning)

confidence: 99%

Asymmetric Adaptive Fusion in a Two-Stream Network for RGB-D Human Detection

Zhang, Guo, Wang et al. 2021, Sensors

In recent years, human detection in indoor scenes has been widely applied in smart buildings and smart security, but many related challenges can still be difficult to address, such as frequent occlusion, low illumination and multiple poses. This paper proposes an asymmetric adaptive fusion two-stream network (AAFTS-net) for RGB-D human detection. This network can fully extract person-specific depth features and RGB features while reducing the typical complexity of a two-stream network. A depth feature pyramid is constructed by combining contextual information, with the motivation of combining multiscale depth features to improve the adaptability for targets of different sizes. An adaptive channel weighting (ACW) module weights the RGB-D feature channels to achieve efficient feature selection and information complementation. This paper also introduces a novel RGB-D dataset for human detection called RGBD-human, on which we verify the performance of the proposed algorithm. The experimental results show that AAFTS-net outperforms existing state-of-the-art methods and can maintain stable performance under conditions of frequent occlusion, low illumination and multiple poses.

“…Most of the existing algorithms use either color images or depth images as the source of information. Few image fusion algorithms, however, have been proposed in the context of CNN-based image classification [3][4][5][6].…”

Section: Introduction (mentioning)

confidence: 99%

CNN-Based Obstacle Avoidance Using RGB-Depth Image Fusion

Mechal, Idrissi, Mesbah 2021, Lecture Notes in Electrical Engineering

In the last few years, deep learning has attracted wide interest and achieved great success in many computer vision related applications, such as image classification, object detection, object tracking, pose estimation and action recognition. One specific application that can greatly benefit from the recent advance of deep learning is robot vision-based obstacle avoidance. Vision-based obstacle avoidance systems are mostly based on classification algorithms. Most of these algorithms use either color images or depth images as the main source of information. In this paper, the aim is to investigate whether using information extracted from both types of images simultaneously would give better performance than using each one separately. To do this, we chose the convolutional neural network (CNN) as the classifier and an HSV-based method to achieve the fusion. We tested this approach using two widely used pre-trained CNN architectures, namely Resnet-50 and GoogLeNet, using a locally collected dataset. The results indicate that the image fusion-based classification algorithm achieves a higher accuracy (91.3%) than the one based on depth images (80.4%) but lower than the one based on color images (93.7%). These results can be partly explained by the fact that the classifiers used were pre-trained on color image datasets.
