Abstract: As virtual reality technology continues to develop, it is increasingly applied in healthcare, education, business, and other fields. In these applications, position and attitude (pose) estimation is indispensable as a spatial positioning technique. Traditional pose estimation suffers from heavy dependence on the environment and high complexity. Convolutional neural networks (CNNs) and other computational intelligence techniques, however, provide strong support for progress in pose estimat…
“…However, the lack of real data for training the network makes it difficult to expand the network to new application scenarios, such as in the field of smart manufacturing [32–34] and autonomous driving [35, 36]. For this purpose, we use virtual reality techniques [37–40] to produce datasets on weakly textured industrial parts. We independently design a series of comparative experiments to verify the advantages of using virtual reality technology to produce datasets, such as avoiding the problems of a single background, small changes in object position and pose, and easy overfitting that exist in real datasets of YCB videos [41, 42].…”
This paper focuses on 6D pose estimation for weakly textured targets from RGB-D images. A 6D pose estimation algorithm (DOPE++), based on a deep neural network, is proposed to address the poor real-time performance and low recognition efficiency of pose estimation when a robot grasps weakly textured parts. First, the depthwise separable convolution operation is introduced to lighten the original deep object pose estimation (DOPE) network structure and improve its operating speed. Second, an attention mechanism is introduced to improve network accuracy. To address the original DOPE network's low recognition efficiency for parts that occlude one another, and its false recognitions for parts whose scale is too large or too small, a random mask local processing method and a multiscale fusion pose estimation module are proposed. The results show that the proposed DOPE++ network improves the real-time performance of 6D pose estimation and enhances the recognition of parts at different scales without loss of accuracy. To address the single-background limitation of the part pose estimation dataset, a virtual dataset is constructed for data expansion, forming a hybrid dataset.
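To illustrate why a depthwise separable convolution lightens a network, the following is a minimal sketch that compares weight counts for a single layer. The layer sizes (64 input channels, 128 output channels, 3×3 kernel) and the function names are assumptions chosen for illustration; they are not taken from the DOPE++ paper, and bias terms are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
c_in, c_out, k = 64, 128, 3

std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)
ratio = sep / std  # equals 1 / c_out + 1 / k**2

print(f"standard: {std}, separable: {sep}, ratio: {ratio:.3f}")
```

For these assumed sizes the separable layer needs roughly 12% of the weights of the standard layer, which is the kind of reduction that motivates using it to speed up a network.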
“…As a cornerstone of computer vision, 2D HPE drives the prosperity and development of action recognition [1], pedestrian tracking [2], gesture recognition [3], gait recognition [4] and other related fields [5], [6]. Meanwhile, real-time 2D HPE extends its influence to daily activity scenarios, including intelligent video surveillance [7], patient monitoring systems [8], virtual reality [9], autonomous drive [10], human animation [11], smart home [12], [13], athlete-assisted training [14], etc.…”