Research on tooth surface design based on deep neural networks has recently made progress in both accuracy and execution efficiency. However, unrealistic outputs remain a challenge, partly because of (1) the lack of semantic guidance, (2) the inability to discover and rectify false results, and (3) the lack of exploration of structural coherence in intermediate layers. In this paper, we present an approach that predicts depth images of designed teeth with a conditional generative adversarial network (CGAN) that incorporates semantic guidance. Moreover, the uncertainty of semantic inference is used to improve the model outputs, and a structural coherence loss is proposed for adversarial learning to enhance the discrimination capability of the network in intermediate layers. We evaluate our approach on the Shining3D tooth dataset. The experimental results show that our method achieves higher accuracy than other available approaches.

This is an open access article under the terms of the Creative Commons Attribution-NonCommercial License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.