2020
DOI: 10.1049/el.2019.4150
Multi‐modal deep network for RGB‐D segmentation of clothes

Abstract: In this Letter, the authors propose a deep learning based method to perform semantic segmentation of clothes from RGB-D images of people. First, they present a synthetic dataset containing more than 50,000 RGB-D samples of characters in different clothing styles, featuring various poses and environments, for a total of nine semantic classes. The proposed data generation pipeline allows for fast production of RGB images, depth images, and ground-truth label maps. Secondly, a novel multi-modal encoder-decoder convolutional…
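The abstract names a multi-modal encoder-decoder convolutional network but is truncated before describing it. As a rough illustration only, a dual-branch encoder-decoder for nine-class RGB-D segmentation could be sketched as below; all layer sizes, names, and the fusion point are assumptions, not the letter's actual architecture.

```python
import torch
import torch.nn as nn

class RGBDSegNet(nn.Module):
    """Minimal multi-modal encoder-decoder sketch: one encoder per modality
    (RGB, depth), concatenated features, one decoder producing a per-pixel
    label map over 9 semantic classes. Purely illustrative, not the paper's
    network."""

    def __init__(self, num_classes: int = 9):
        super().__init__()
        def encoder(in_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            )
        self.rgb_enc = encoder(3)      # RGB branch (3 channels)
        self.depth_enc = encoder(1)    # depth branch (1 channel)
        self.decoder = nn.Sequential(  # fuse, then upsample back to input size
            nn.Conv2d(128, 64, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
            nn.Conv2d(64, num_classes, 1),  # per-pixel class logits
        )

    def forward(self, rgb, depth):
        fused = torch.cat([self.rgb_enc(rgb), self.depth_enc(depth)], dim=1)
        return self.decoder(fused)

# usage: logits over 9 classes at input resolution
logits = RGBDSegNet()(torch.rand(1, 3, 224, 224), torch.rand(1, 1, 224, 224))
print(logits.shape)  # torch.Size([1, 9, 224, 224])
```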

Cited by 6 publications (7 citation statements). References 16 publications (22 reference statements).
“…For example, Linemod [8] locates and estimates object poses by extracting gradient features from RGB images and normal features from depth images. Other methods use deep learning to extract RGB-D features: Shao et al. [35] propose two fusion strategies, the first concatenating the RGB and depth images into a single raw input for a CNN, and the second, like [3,36,37], using separate CNNs to extract RGB and depth features and then concatenating those features as the input for object segmentation and pose estimation. However, these methods neglect the inner structure of the depth channel and treat depth features as a mere supplement to the RGB feature channels.…”
Section: Pose from RGB-D Data (mentioning)
confidence: 99%
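The two fusion strategies contrasted in this statement differ only in where the modalities are combined. As a concrete illustration, a minimal PyTorch sketch follows; the layer widths are arbitrary assumptions, not the cited networks.

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU())

# Early fusion: stack RGB (3 channels) and depth (1 channel) into a single
# 4-channel image and feed one CNN.
early_net = conv_block(4, 64)

def early_fusion(rgb, depth):
    return early_net(torch.cat([rgb, depth], dim=1))

# Late fusion: extract features per modality with separate CNNs, then
# concatenate the feature maps for the downstream segmentation/pose head.
rgb_net, depth_net = conv_block(3, 32), conv_block(1, 32)

def late_fusion(rgb, depth):
    return torch.cat([rgb_net(rgb), depth_net(depth)], dim=1)

rgb = torch.rand(1, 3, 64, 64)    # RGB image batch
depth = torch.rand(1, 1, 64, 64)  # aligned depth map batch
print(early_fusion(rgb, depth).shape)  # torch.Size([1, 64, 64, 64])
print(late_fusion(rgb, depth).shape)   # torch.Size([1, 64, 64, 64])
```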
“…The emergence of deep learning (DL) has opened up new possibilities for image target segmentation and recognition, performing well even on a variety of small-scale datasets [11]. At the same time, traditional DL network models still suffer from large parameter counts and complex computations as network depth grows [12, 13], which leads to insufficient real-time performance for image segmentation.…”
Section: Introduction (mentioning)
confidence: 99%
“…Introduction: 3D registration is a classical and fundamental problem with countless applications. As commodity depth cameras have become less expensive and more accurate, depth images play an increasingly important role in numerous tasks [1]. To obtain comprehensive information about a 3D scene, point clouds captured from multiple views need to be aligned.…”
(mentioning)
confidence: 99%
“…The well-established method is iterative closest point (ICP) [2], on the basis of which a myriad of variants have been proposed. In ICP, given a source shape and a target shape, the following steps are performed: (1) for each point in the source shape, identify the closest corresponding point in the target shape; (2) estimate the transformation that minimizes the mean squared Euclidean distance between these correspondences; (3) transform the source shape using the transformation estimated in step 2; (4) iterate the above steps until the mean squared distance falls below a pre-defined threshold. ICP and its variants are the dominant methods for 3D registration.…”
(mentioning)
confidence: 99%
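The four ICP steps quoted above translate almost line-for-line into code. Below is a minimal NumPy/SciPy sketch of point-to-point ICP, using a KD-tree for the correspondence search (step 1) and the standard SVD-based (Kabsch) solution for the rigid transform (step 2); function names and defaults are illustrative assumptions, and the stopping rule uses the common "error stops improving" variant of the text's threshold test.

```python
import numpy as np
from scipy.spatial import cKDTree

def best_rigid_transform(src, dst):
    """SVD (Kabsch) solution minimizing the mean squared distance between
    matched point sets: returns R, t such that dst ~ src @ R.T + t."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)   # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:              # avoid reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, dst_c - R @ src_c

def icp(source, target, max_iters=50, tol=1e-6):
    """Point-to-point ICP over (N, 3) arrays, following steps 1-4 above."""
    tree = cKDTree(target)                # fixed target for fast lookup
    src = source.copy()
    prev_err = np.inf
    for _ in range(max_iters):
        dists, idx = tree.query(src)      # step 1: closest correspondences
        R, t = best_rigid_transform(src, target[idx])  # step 2: solve transform
        src = src @ R.T + t               # step 3: apply to source shape
        err = np.mean(dists ** 2)
        if abs(prev_err - err) < tol:     # step 4: stop when error plateaus
            break
        prev_err = err
    return src
```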