2022
DOI: 10.1109/lgrs.2021.3061422

Point Transformer for Shape Classification and Retrieval of Urban Roof Point Clouds

Cited by 8 publications (4 citation statements)
References 14 publications
“…There are also some methods [7,30,31] that use self-attention and transformer layers, which have revolutionised the study of machine translation and natural language processing, to directly process 3D points and make progress on point cloud processing tasks. However, they cannot be directly applied to the point-wise class prediction of large-scale point clouds because the self-attention layer will produce O(L²) time complexity and memory usage.…”
Section: Point-based Methods
confidence: 99%
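As a concrete illustration of the quadratic-cost argument in the excerpt above, the sketch below (hypothetical Python/NumPy code, not taken from the cited works) computes naive global self-attention over N points; the explicit N-by-N score matrix is what makes both time and memory scale as O(N²).

import numpy as np

def naive_global_attention(feats: np.ndarray) -> np.ndarray:
    """Full self-attention over all N points at once.

    feats: (N, C) per-point features. The (N, N) score matrix makes
    time and memory grow quadratically in N, which is the issue the
    excerpt raises for large-scale point clouds. The projection
    weights here are random placeholders (illustrative only).
    """
    c = feats.shape[1]
    rng = np.random.default_rng(0)
    w_q, w_k, w_v = (rng.standard_normal((c, c)) for _ in range(3))
    q, k, v = feats @ w_q, feats @ w_k, feats @ w_v
    scores = (q @ k.T) / np.sqrt(c)            # (N, N): quadratic in N
    scores -= scores.max(axis=1, keepdims=True)
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)    # row-wise softmax
    return attn @ v                            # (N, C) attended features

# 2048 points is fine, but a 10**6-point scene would need a
# 10**6 x 10**6 float32 score matrix (~4 TB), hence local attention.
out = naive_global_attention(np.random.default_rng(1).standard_normal((2048, 32)))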
“…Recently, some specially designed deep neural networks have been used to process 3D point clouds. These methods can be categorised into the following categories: (1) point-based methods [1][2][3][4][5][6][7][8][9][10] that directly operate on 3D points and output semantic information; (2) voxel-based methods [11][12][13][14][15][16][17][18] that voxelise point clouds into 3D grids and then use 3D CNNs to process these grids.…”
Section: Introduction
confidence: 99%
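To make the voxel-based category above concrete, here is a minimal voxelisation sketch, assuming a dense occupancy grid and a fixed voxel size; it is illustrative only and does not correspond to any specific method in [11]-[18], which typically use sparse representations.

import numpy as np

def voxelise(points: np.ndarray, voxel_size: float = 0.05,
             grid_dim: int = 64) -> np.ndarray:
    """Quantise an (N, 3) point cloud into a dense occupancy grid.

    Points are snapped to grid cells so a 3D CNN can consume the
    resulting (grid_dim)^3 volume. Sketch under assumed parameters.
    """
    mins = points.min(axis=0)
    idx = np.floor((points - mins) / voxel_size).astype(int)
    idx = np.clip(idx, 0, grid_dim - 1)
    grid = np.zeros((grid_dim, grid_dim, grid_dim), dtype=np.float32)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0   # mark occupied cells
    return grid

occupancy = voxelise(np.random.rand(10_000, 3) * 3.0)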
“…By using the multi-head attention mechanism, each head solves for its own set of q, k, and v matrices. These per-head matrices are then concatenated to obtain the matrices Q, K, and V. Following the practice of Point Transformer [42,43], we use subtraction instead of the dot product for the interaction between Q and K. Therefore, the attention score y is calculated according to the scaled dot-product attention:…”
Section: Multiscale Feature Extraction and Fusion (MEF) Unit
confidence: 99%
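The excerpt ends before the equation it announces. As a hedged sketch of the subtraction-based (vector) attention used by Point Transformer [42,43], which the excerpt says it follows, the snippet below replaces the q·k dot product with a per-channel difference q − k; the positional-encoding term is omitted, and this is an assumed general form rather than the citing paper's exact formula.

import numpy as np

def subtraction_attention(q: np.ndarray, k: np.ndarray,
                          v: np.ndarray) -> np.ndarray:
    """Vector (subtraction-based) attention over one neighbourhood.

    q: (C,) query of the centre point; k, v: (K, C) keys/values of its
    K neighbours. The relation is q - k_j per channel instead of the
    scalar dot product q . k_j, so every channel gets its own weight.
    Illustrative sketch only; positional encoding omitted.
    """
    rel = (q[None, :] - k) / np.sqrt(q.shape[0])  # (K, C), scaled
    rel -= rel.max(axis=0, keepdims=True)
    w = np.exp(rel)
    w /= w.sum(axis=0, keepdims=True)     # softmax over neighbours, per channel
    return (w * v).sum(axis=0)            # (C,) aggregated feature

y = subtraction_attention(np.random.rand(32),
                          np.random.rand(16, 32),
                          np.random.rand(16, 32))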
“…Li et al. proposed the SO-Net model [6], which can systematically adjust the receptive-field overlap to perform hierarchical feature extraction by performing a point-to-node KNN search on a SOM. Transformer-based methods [18,23,14,31] have emerged in the past two years. The Transformer is based on self-attention (SA), which was initially used for natural language processing and then gradually applied to computer vision due to its strong feature representation ability.…”
Section: Introduction
confidence: 99%
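The point-to-node KNN grouping mentioned for SO-Net can be sketched as follows; the SOM node positions here are random placeholders (an assumption, not a trained SOM), and the parameter k controls the receptive-field overlap the excerpt refers to.

import numpy as np

def point_to_node_knn(points: np.ndarray, nodes: np.ndarray,
                      k: int = 3) -> np.ndarray:
    """Assign every point to its k nearest SOM nodes.

    points: (N, 3); nodes: (M, 3) SOM node positions. Returns (N, k)
    node indices. Larger k means more receptive-field overlap between
    nodes. Illustrative sketch only, not SO-Net's implementation.
    """
    d2 = ((points[:, None, :] - nodes[None, :, :]) ** 2).sum(-1)  # (N, M)
    return np.argsort(d2, axis=1)[:, :k]

nodes = np.random.rand(64, 3)        # placeholder for trained SOM nodes
points = np.random.rand(5_000, 3)
groups = point_to_node_knn(points, nodes, k=3)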