2022
DOI: 10.2139/ssrn.4104342
Upsampling Autoencoder for Self-Supervised Point Cloud Learning

Cited by 12 publications (18 citation statements) | References 57 publications
“…To address this issue, Point-MAE [10] employs a mini-PointNet [22] as the point embedding module to achieve permutation invariance. Similarly, MaskSurf [11] adds a normal prediction module to enhance point cloud understanding. Even though it outperforms Point-MAE on real-world datasets, we argue that normal vectors are not sufficiently robust and descriptive to capture all the nuances in the data.…”
Section: B. Self-supervised Representation Learning
Mentioning confidence: 99%
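The mini-PointNet embedding referenced above amounts to a shared per-point MLP followed by a symmetric pooling over the points of each patch, which is what makes the resulting patch token independent of point ordering. Below is a minimal sketch of such a module in PyTorch; the patch tensor shape (B, G, K, 3) and the layer widths are illustrative assumptions, not Point-MAE's exact configuration.

```python
# Minimal sketch (PyTorch) of a mini-PointNet-style patch embedding.
# Shapes and layer widths are illustrative assumptions, not Point-MAE's exact design.
import torch
import torch.nn as nn

class MiniPointNetEmbedding(nn.Module):
    def __init__(self, embed_dim=384):
        super().__init__()
        # Shared per-point MLP: every point in a patch is processed independently.
        self.mlp = nn.Sequential(
            nn.Linear(3, 128), nn.GELU(),
            nn.Linear(128, embed_dim),
        )

    def forward(self, patches):
        # patches: (B, G, K, 3) -> B clouds, G patches per cloud, K points per patch
        feats = self.mlp(patches)          # (B, G, K, embed_dim)
        # Symmetric max-pooling over the K points makes the patch token
        # invariant to the order of points within the patch.
        tokens, _ = feats.max(dim=2)       # (B, G, embed_dim)
        return tokens

# Usage sketch with random patches
tokens = MiniPointNetEmbedding()(torch.randn(2, 64, 32, 3))  # -> (2, 64, 384)
```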
“…Next, we utilize an MLP to embed the center coordinates of the visible patches into positional tokens $L^v_T$. We observe that most existing MAE-based methods [10], [11], [34] leverage the standard Transformer [16] for self-supervised learning, which has quadratic computational complexity and ignores potential correlations between different data samples. Taking the visible feature tokens $T^v$ and positional tokens $L^v_T$ as inputs, we propose an external-attention-based Transformer encoder to excavate deep high-level latent features while minimizing the computational cost.…”
Section: B. External Attention-based Transformer Encoder
Mentioning confidence: 99%
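External attention replaces the token-to-token self-attention map with attention against a small set of learnable external memory units, so the cost grows linearly with the number of tokens and the memories are shared across samples, which is how correlations between different data samples can be exploited. The sketch below assumes PyTorch, illustrative dimensions, and the double-normalization commonly used with external attention; it is not the cited encoder's exact design.

```python
# Minimal sketch (PyTorch) of an external-attention layer with two learnable
# memory units. Dimensions and normalization are illustrative assumptions.
import torch
import torch.nn as nn

class ExternalAttention(nn.Module):
    def __init__(self, dim=384, mem_size=64):
        super().__init__()
        self.mk = nn.Linear(dim, mem_size, bias=False)  # external key memory
        self.mv = nn.Linear(mem_size, dim, bias=False)  # external value memory

    def forward(self, x):
        # x: (B, N, dim) visible feature tokens plus positional tokens.
        attn = self.mk(x)                  # (B, N, mem_size); cost is linear in N
        attn = attn.softmax(dim=1)         # normalize over the N tokens
        attn = attn / (attn.sum(dim=-1, keepdim=True) + 1e-6)  # l1 over memory slots
        return self.mv(attn)               # (B, N, dim)

# Usage sketch: 128 visible tokens of width 384
out = ExternalAttention()(torch.randn(2, 128, 384))  # -> (2, 128, 384)
```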
“…In addition, some recent approaches [10,37,61] have introduced cross-modal data such as images and text to enhance the pre-training of Masked Point Modeling tasks. There are also some methods [26,62] that improve the objective of the Masked Point Modeling task itself.…”
Section: Masked Autoencoders
Mentioning confidence: 99%