View Inter-Prediction GAN: Unsupervised Representation Learning for 3D Shapes by Learning Global Shape Memories to Support Local View Predictions

Han, Zhizhong; Shang, Mingyang; Liu, Yu‐Shen; Zwicker, Matthias

doi:10.1609/aaai.v33i01.33018376

Cited by 118 publications

(58 citation statements)

References 6 publications


“…
Method | MN40(%) | MN10(%)
3DGAN [Wu and others, 2016] | 83.3 | 91.0
PointNet++ [Qi and others, 2017] | 91.9 | -
FoldingNet [Yang et al, 2018] | 88.4 | 94.4
PANO [Sfikas and others, 2017] | 90.7 | 91.1
Pairwise [Johns et al, 2016] | 90.7 | 92.8
GIFT [Bai and others, 2017] | 89.5 | 91.5
Domi [Wang and others, 2017] | 92.2 | -
MVCNN [Su and others, 2015] | 90.1 | -
Spherical [Cao et al, 2017] | 93.31 | -
Rotation [Kanezaki et al, 2018] | 92.37 | 94.39
SO-Net [Li and others, 2018] | 90.9 | 94.1
SVSL [Han and others, 2019] | 93.31 | 94.82
VIPGAN [Han et al, 2019a] | 91

Attention visualization. We visualize the attention learned by 3DViewGraph under ModelNet40, which demonstrates how 3DViewGraph understands 3D shapes by analyzing views on a view graph.…”

Section: Methods (mentioning)

confidence: 99%

“…Global features of 3D shapes can be learned from raw 3D representations, such as meshes, voxels, and point clouds. As an alternative, a number of works in 3D shape analysis employed multiple views [Su and others, 2015; Han et al, 2019b] as raw 3D representation, exploiting the advantage that multiple views can facilitate understanding of both manifold and non-manifold 3D shapes via computer vision techniques. Therefore, effectively and efficiently aggregating comprehensive information over multiple views is critical for the discriminability of learned features, especially in deep learning models.…”

Section: Introduction (mentioning)

confidence: 99%


3DViewGraph: Learning Global Features for 3D Shapes from A Graph of Unordered Views with Attention

Han, Wang, Vong et al. 2019

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence

Self Cite


Learning global features by aggregating information over multiple views has been shown to be effective for 3D shape analysis. For view aggregation in deep learning models, pooling has been applied extensively. However, pooling leads to a loss of the content within views, and the spatial relationship among views, which limits the discriminability of learned features. We propose 3DViewGraph to resolve this issue, which learns 3D global features by more effectively aggregating unordered views with attention. Specifically, unordered views taken around a shape are regarded as view nodes on a view graph. 3DViewGraph first learns a novel latent semantic mapping to project low-level view features into meaningful latent semantic embeddings in a lower dimensional space, which is spanned by latent semantic patterns. Then, the content and spatial information of each pair of view nodes are encoded by a novel spatial pattern correlation, where the correlation is computed among latent semantic patterns. Finally, all spatial pattern correlations are integrated with attention weights learned by a novel attention mechanism. This further increases the discriminability of learned features by highlighting the unordered view nodes with distinctive characteristics and depressing the ones with appearance ambiguity. We show that 3DViewGraph outperforms state-of-the-art methods under three large-scale benchmarks.
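The contrast the abstract draws between pooling and attention-weighted aggregation can be illustrated with a minimal sketch. This is not the 3DViewGraph mechanism itself (which relies on a latent semantic mapping and spatial pattern correlations); it uses plain dot-product attention as a stand-in, and the `query` vector and the toy view features are hypothetical values chosen only for illustration.

```python
import math

def max_pool(views):
    """Element-wise max over view feature vectors (order-invariant)."""
    return [max(column) for column in zip(*views)]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_aggregate(views, query):
    """Weighted sum of view features.

    Weights are the softmax of each view's dot product with `query`.
    In a trained model `query` would be learned; here it is an assumption.
    """
    scores = [sum(v_d * q_d for v_d, q_d in zip(view, query)) for view in views]
    weights = softmax(scores)
    dim = len(views[0])
    pooled = [sum(w * view[d] for w, view in zip(weights, views)) for d in range(dim)]
    return pooled, weights

# Three toy view features (hypothetical), each of dimension 2.
views = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
query = [2.0, 0.0]  # hypothetical scoring vector favouring the first dimension

pooled, weights = attention_aggregate(views, query)
print("max pooling:", max_pool(views))
print("attention weights:", weights)
print("attention-weighted feature:", pooled)
```

Max pooling keeps only the largest value per dimension and discards which view it came from, whereas the weighted sum retains a graded contribution from every view. Both operations give the same result when the views are reordered, which is the property needed for unordered views.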


“…There are also methods jointly learning features from point clouds and multi-view projections [47]. It is also possible to treat point clouds and views as sequences [26,17,15], or to use unsupervised learning [16].…”

Section: Related Work (mentioning)

confidence: 99%

Revisiting Point Cloud Classification: A New Benchmark Dataset and Classification Model on Real-World Data

Pham, Hua et al. 2019

2019 IEEE/CVF International Conference on Computer Vision (ICCV)


Deep learning techniques for point cloud data have demonstrated great potential in solving classical problems in 3D computer vision such as 3D object classification and segmentation. Several recent 3D object classification methods have reported state-of-the-art performance on CAD model datasets such as ModelNet40 with high accuracy (∼92%). Despite such impressive results, in this paper, we argue that object classification is still a challenging task when objects are framed with real-world settings. To prove this, we introduce ScanObjectNN, a new real-world point cloud object dataset based on scanned indoor scene data. From our comprehensive benchmark, we show that our dataset poses great challenges to existing point cloud classification techniques as objects from real-world scans are often cluttered with background and/or are partial due to occlusions. We identify three key open problems for point cloud object classification, and propose new point cloud classification neural networks that achieve state-of-the-art performance on classifying objects with cluttered background. Our dataset and code are publicly available on our project page.


“…Recently, a series of unsupervised-learning models are proposed to learn from 3D point clouds [14], [15], [16], [17], [18]. For example, 3D GAN converts 3D points to 3D voxels [9], which introduces a lot of empty voxels and loses precision; LatentGAN handles 3D point clouds directly [10]; however, the decoder uses fully-connected layers, which does not explore specific geometric structures of 3D point clouds and requires a huge number of training parameters; and VIP-GAN uses recurrent-neural-network-based architecture to solve multiple view inter-prediction tasks for each shape [19]; [20] learns a continuous signed distance function representation of a class of shapes that enables high quality shape representation, interpolation and completion from partial and noisy 3D input data. In this work, we use deep autoencoder to directly handle unorganized 3D points and propose graph-based operations to explore geometric structures of 3D point clouds.…”

Section: A Unsupervised Learning (mentioning)

confidence: 99%

Deep Unsupervised Learning of 3D Point Clouds via Graph Topology Inference and Filtering

Chen, Duan, Yang et al. 2020

IEEE Trans. on Image Process.


We propose a deep autoencoder with graph topology inference and filtering to achieve compact representations of unorganized 3D point clouds in an unsupervised manner. Many previous works discretize 3D points to voxels and then use lattice-based methods to process and learn 3D spatial information; however, this leads to inevitable discretization errors. In this work, we try to handle raw 3D points without such compromise. The encoder of the proposed networks adopts similar architectures as in PointNet, which is a well-acknowledged method for supervised learning of 3D point clouds. The decoder of the proposed networks involves three novel modules: the folding module, the graph-topology-inference module, and the graph-filtering module. The folding module folds a canonical 2D lattice to the underlying surface of a 3D point cloud, achieving coarse reconstruction; the graph-topology-inference module learns a graph topology to represent pairwise relationships between 3D points, pushing the latent code to preserve both coordinates and pairwise relationships of points in 3D point clouds; and the graph-filtering module designs graph filters based on the learnt graph topology and refines the coarse reconstruction to obtain the final reconstruction. We further provide theoretical analyses of the proposed architecture. We provide an upper bound for the reconstruction loss and further show the superiority of graph smoothness over spatial smoothness as a prior to model 3D point clouds. In the experiments, we validate the proposed networks in three tasks, including 3D point clouds reconstruction, visualization, and transfer classification. The experimental results show that (1) the proposed networks outperform the state-of-the-art methods in various tasks, including reconstruction and transfer classification; (2) a graph topology can be inferred as auxiliary information without specific supervision on graph topology inference; and (3) graph filtering refines the reconstruction, leading to better performances.
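The idea of folding a canonical 2D lattice onto a 3D surface can be pictured with a small sketch. In the paper the mapping is a learned decoder conditioned on a latent code; the version below swaps in a hand-written function that wraps the lattice into a cylinder, so `fold`, `make_lattice`, and the two-number `code` are all hypothetical stand-ins rather than the authors' implementation.

```python
import math

def make_lattice(n):
    """Return an n x n grid of (u, v) points evenly spaced in [0, 1]^2."""
    step = 1.0 / (n - 1)
    return [(i * step, j * step) for i in range(n) for j in range(n)]

def fold(point, code):
    """Map one 2D lattice point to 3D, conditioned on `code`.

    Hand-written stand-in for a learned decoder: the sheet is wrapped
    into a cylinder whose radius and height are read from `code`.
    """
    u, v = point
    radius, height = code
    angle = 2.0 * math.pi * u
    return (radius * math.cos(angle), radius * math.sin(angle), height * v)

lattice = make_lattice(4)
cloud = [fold(p, code=(1.0, 2.0)) for p in lattice]  # hypothetical code values
print(len(cloud), "points; first:", cloud[0])
```

Changing `code` changes the shape that the same lattice is deformed into, which mirrors the role of the latent code in a folding-style decoder: the lattice is fixed and shared, and only the code varies per shape.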
