2019
DOI: 10.3390/s19040866
Exploring RGB+Depth Fusion for Real-Time Object Detection

Abstract: In this paper, we investigate whether fusing depth information on top of normal RGB data for camera-based object detection can help to increase the performance of current state-of-the-art single-shot detection networks. Indeed, depth sensing is easily acquired using depth cameras such as a Kinect or stereo setups. We investigate the optimal manner to perform this sensor fusion with a special focus on lightweight single-pass convolutional neural network (CNN) architectures, enabling real-time processing on limi…

Cited by 62 publications (34 citation statements)
References 31 publications
“…These groups have used a variety of neural network architectures. Some authors input explicit depth and imaging data on independent channels that are processed separately through several network layers [4], [18], [29], [34]. After some processing, the depth and radiance channels are fused.…”
Section: Related Work (mentioning, confidence: 99%)
“…This design raises the question of which layer is best for merging the independent channels; one might expect that the answer depends on both the network and the data. Using a YOLOv2 network, [29] explored how variations in the merged layer influenced performance. The portions of their analysis most relevant to our work detected vehicles using data obtained from the KITTI database.…”
Section: Related Work (mentioning, confidence: 99%)
“…Further, the authors introduce the 3D IoU with Volume of Overlap and Volume of Union for a better 3D bounding box proposal. Ophoff et al. [44] propose a different approach. They use a separate network stream for the RGB and depth information each and fuse them by a concatenation layer.…”
Section: You Only Look Once (mentioning, confidence: 99%)
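The statements above all describe the same basic design: RGB and depth run through independent streams, are concatenated at some layer, and the choice of that fusion layer is itself an experimental question. The minimal PyTorch sketch below illustrates the idea; the layer count, channel widths, `fuse_at` parameter, and absence of downsampling are illustrative assumptions, not the architecture used in the cited papers.

```python
# Minimal sketch of a two-stream RGB+depth backbone fused by channel-wise
# concatenation at a configurable depth. Layer counts, channel widths and
# the absence of downsampling are illustrative assumptions, NOT the
# architecture of the cited papers.
import torch
import torch.nn as nn


def conv_block(in_ch: int, out_ch: int) -> nn.Module:
    """3x3 convolution + batch norm + ReLU building block."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class TwoStreamBackbone(nn.Module):
    """RGB and depth run through independent streams up to `fuse_at`,
    are concatenated there, and continue through a single shared trunk."""

    def __init__(self, channels=(32, 64, 128, 256), fuse_at: int = 2):
        super().__init__()
        assert 0 <= fuse_at < len(channels)
        rgb_in, depth_in = 3, 1  # depth assumed to be a 1-channel map
        self.rgb_stream, self.depth_stream = nn.ModuleList(), nn.ModuleList()
        self.shared = nn.ModuleList()
        for i, out_ch in enumerate(channels):
            if i < fuse_at:            # independent processing
                self.rgb_stream.append(conv_block(rgb_in, out_ch))
                self.depth_stream.append(conv_block(depth_in, out_ch))
                rgb_in = depth_in = out_ch
            elif i == fuse_at:         # fusion by concatenation
                self.shared.append(conv_block(rgb_in + depth_in, out_ch))
                rgb_in = out_ch
            else:                      # shared trunk after fusion
                self.shared.append(conv_block(rgb_in, out_ch))
                rgb_in = out_ch

    def forward(self, rgb: torch.Tensor, depth: torch.Tensor) -> torch.Tensor:
        for rgb_layer, depth_layer in zip(self.rgb_stream, self.depth_stream):
            rgb, depth = rgb_layer(rgb), depth_layer(depth)
        x = torch.cat([rgb, depth], dim=1)  # channel-wise concatenation
        for layer in self.shared:
            x = layer(x)
        return x


# Sweeping `fuse_at` from 0 (early fusion) toward the last layer (late
# fusion) is one way to probe where mid-level fusion pays off.
model = TwoStreamBackbone(fuse_at=2)
out = model(torch.randn(1, 3, 128, 128), torch.randn(1, 1, 128, 128))
```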
“…Some detection methods using RGB-D two-stream network have achieved a good result. Ophoff et al. [29] explored the best fusion position of RGB and depth information in the CNN, from which they concluded that the best results can be obtained by feature fusion towards the mid to late layers. Gupta et al. [30] proposed a depth map encoding method called HHA (horizontal disparity, height above ground, and angle with respect to gravity), which can encode the depth map into a three-channel image like RGB images.…”
Section: Introduction (mentioning, confidence: 99%)
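The HHA encoding mentioned above converts a one-channel depth map into a three-channel image (horizontal disparity, height above ground, angle with gravity) so that networks designed for RGB input can consume depth directly. Below is a heavily simplified sketch of the idea; the camera height, stereo baseline, value ranges, and level-camera assumption are all stand-ins for quantities the original method estimates from the data.

```python
# Heavily simplified sketch of HHA-style depth encoding in the spirit of
# Gupta et al. [30]. The real method infers the ground plane and gravity
# direction from the data and normalises channels over the training set;
# here a level camera at a known height and a known stereo baseline are
# assumed, so treat this as an illustration, not a reference implementation.
import numpy as np


def hha_encode(depth_m, fx, fy, cy, camera_height_m=1.0, baseline_m=0.1):
    """Encode a metric depth map (H, W) into a 3-channel uint8 image:
    horizontal disparity, height above ground, angle with gravity."""
    h, w = depth_m.shape
    z = np.clip(depth_m, 0.1, 10.0)  # guard against zero/invalid depth

    # Channel 1: horizontal disparity, inversely proportional to depth.
    disparity = (baseline_m * fx) / z

    # Back-project to the camera-space y coordinate (image y points down);
    # this simplified variant only needs y for the height channel.
    v = np.arange(h, dtype=np.float64)[:, None]
    y = (v - cy) * z / fy

    # Channel 2: height above ground, valid only under the level-camera,
    # known-height assumption made above.
    height = camera_height_m - y

    # Channel 3: angle between an approximate surface normal (from depth
    # gradients) and the gravity direction; sign conventions simplified.
    dzdx = np.gradient(z, axis=1) * fx / z
    dzdy = np.gradient(z, axis=0) * fy / z
    normal = np.dstack([-dzdx, -dzdy, np.ones_like(z)])
    normal /= np.linalg.norm(normal, axis=2, keepdims=True)
    gravity = np.array([0.0, 1.0, 0.0])  # straight down in camera coords
    angle = np.degrees(np.arccos(np.clip(normal @ gravity, -1.0, 1.0)))

    # Scale each channel to [0, 255] so the result can be consumed by a
    # CNN exactly like a three-channel RGB image.
    def to_u8(c, lo, hi):
        return np.clip((c - lo) / (hi - lo) * 255.0, 0, 255).astype(np.uint8)

    return np.dstack([to_u8(disparity, 0.0, disparity.max()),
                      to_u8(height, -1.0, 3.0),
                      to_u8(angle, 0.0, 180.0)])
```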