2015 IEEE International Conference on Computer Vision (ICCV) 2015
DOI: 10.1109/iccv.2015.114

Multi-view Convolutional Neural Networks for 3D Shape Recognition

Abstract: A longstanding question in computer vision concerns the representation of 3D shapes for recognition: should 3D shapes be represented with descriptors operating on their native 3D formats, such as voxel grid or polygon mesh, or can they be effectively represented with view-based descriptors? We address this question in the context of learning to recognize 3D shapes from a collection of their rendered views on 2D images. We first present a standard CNN architecture trained to recognize the shapes' rendered views…

Cited by 2,477 publications (2,083 citation statements)
References: 34 publications
“…In general, shape structures are defined by the arrangement of, and relations between, shape parts. Developing neural nets for structured shape representations requires a significant departure from existing works on convolutional neural networks (CNNs) for volumetric [Wu et al 2015;Girdhar et al 2016;Yumer and Mitra 2016;Wu et al 2016] or view-based [Su et al 2015;Qi et al 2016;Sinha et al 2016] shape representations. These works primarily adapt classical CNN architectures for image analysis.…”
Section: Introduction (mentioning)
confidence: 99%
“…Other approaches that are based on the ideas similar to the one presented in this paper are rolling feature maps 1 and multi-view networks [26]. The former explores a pooling over a set of transformations, but does not guarantee the transformation-invariance of the features learned.…”
Section: Multiple Instance Learning (mentioning)
confidence: 99%
“…The experimental results show that this method can preserve the shape information of 3D objects to a certain extent through transformation, but the transformation process itself changes the local and global structures of 3D shapes, resulting in the decrease of feature discrimination. Meanwhile, Su et al [2] proposed multi-view convolution network structure (Multi-View CNN, MVCNN) [13]. These authors use the multi-view 2D projection of the 3D object to extract a concise 3D feature descriptor for the classification and retrieval of 3D shapes.…”
Section: Related Work (mentioning)
confidence: 99%
“…Due to point clouds is not in a regular format, most researchers transform this data to regular 3D voxel grids or collections of images before sending them to a deep network architecture. The method of feature learning for 3D object recognition in depth learning can be roughly divided into three methods which including Mutiview based [1,2], volumetric representation based [3,4] and based on point cloud. The multi-view based method is to project the three-dimension shape into the two-dimension image space, and then use the method of depth learning to extract the two-dimension image.…”
Section: Introduction (mentioning)
confidence: 99%