Deep learning based on convolutional neural networks (CNNs) has attracted increasing attention for phase unwrapping in fringe projection three-dimensional (3D) measurement. However, due to the inherent limitations of the convolution operator, it is difficult to accurately determine the fringe order in wrapped phase patterns, a task that relies on continuity and global context. To address this problem, we develop a hybrid CNN-transformer model (Hformer) dedicated to phase unwrapping via fringe order prediction. The proposed Hformer model is mainly composed of a backbone, an encoder, and a decoder, so that it takes advantage of both CNN and transformer components. The backbone serves as a feature extractor for the wrapped phase pattern. The encoder and decoder, which use cross attention, are designed to enhance global dependency for fringe order prediction. Experimental results show that the proposed Hformer model achieves better fringe order prediction than CNN models such as U-Net and DCNN. Our work offers an alternative to the CNN-dominated deep learning approaches to phase unwrapping in fringe projection 3D measurement.
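To make the role of the fringe order concrete: once an integer order k is known at each pixel, the absolute phase follows from the standard relation Φ = φ + 2πk, where φ is the wrapped phase. The sketch below illustrates only this relation on a synthetic one-dimensional phase ramp. It is not the Hformer model: the fringe order is computed from a known ground truth rather than predicted by a network, and the function names and the 12π ramp are hypothetical choices made for illustration.

```python
import math

def wrap_phase(phi):
    """Wrap an absolute phase value into the principal interval [-pi, pi]."""
    return math.atan2(math.sin(phi), math.cos(phi))

def unwrap_with_fringe_order(wrapped, order):
    """Recover absolute phase from wrapped phase and integer fringe order:
    Phi = phi + 2 * pi * k (standard phase unwrapping relation)."""
    return [w + 2.0 * math.pi * k for w, k in zip(wrapped, order)]

# Synthetic ground truth (assumption for illustration): a linear phase ramp
# from 0 to 12*pi, i.e. six full fringe periods across 256 samples.
n = 256
true_phase = [12.0 * math.pi * i / (n - 1) for i in range(n)]

# What a fringe projection system observes: the phase modulo 2*pi.
wrapped = [wrap_phase(p) for p in true_phase]

# Ideal fringe order derived from the ground truth. In a learning-based
# method, a network would predict this integer map from `wrapped` alone.
order = [round((p - w) / (2.0 * math.pi)) for p, w in zip(true_phase, wrapped)]

recovered = unwrap_with_fringe_order(wrapped, order)
```

Because the fringe order is an integer label per pixel, predicting it can be treated as a per-pixel classification problem, and a single misclassified order shifts the recovered phase by a full multiple of 2π at that pixel.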
Recent approaches based on convolutional neural networks have significantly improved depth estimation from structured light images in structured light 3D measurement. However, it remains challenging to simultaneously preserve the global structure and local details of objects in structured light images of complex scenes. In this paper, we design a parallel CNN-Transformer network (PCTNet), which consists of a CNN branch, a Transformer branch, a bidirectional feature fusion module (BFFM), and a cross-feature multi-scale fusion module (CFMS). The BFFM and CFMS modules fuse the local and global features of the two branches to achieve better depth estimation. Comprehensive experiments evaluate our model on four structured light datasets: our simulated fringe and speckle structured light datasets, and public real fringe and speckle structured light datasets. The experiments demonstrate that the proposed PCTNet is an effective architecture, achieving state-of-the-art performance in both qualitative and quantitative evaluation in structured light 3D measurement.