Ming Han, J. Sha, Yanheng Wang, C. Ma, Xiang Zhang
{"title":"FNE-PCT: An Efficient Transformer Network for 3D Classification","authors":"Ming Han, J. Sha, Yanheng Wang, C. Ma, Xiang Zhang","doi":"10.1109/ICMA54519.2022.9856260","DOIUrl":null,"url":null,"abstract":"Detection or classification directly from 3D point clouds has received increasing attention in recent years. Transformer is more suitable for processing point cloud data than convolutional neural networks because of its inherent permutation invariance in processing sequences. However, common sampling strategies increase the training time of the model based on Transformer, such as Point Cloud Transformer (PCT). Aiming at the problem of slow inference speed of PCT, we propose a network structure named Fast Neighbor Embedding Point Cloud Transformer (FNE-PCT) in this paper. Instead of farthest point sample (FPS) and nearest neighbor search in PCT, FNE-PCT uses a fast neighbor embedding module to improve the inference speed and a residual self-attention encoding module to enhance the expression ability. Extensive experiments based on 3D object classification show that our FNE-PCT outperforms other excellent algorithms such as PointNet, PointNet++ and PointCNN. Our FNE-PCT achieves 92.6% accuracy on ModelNet40, which is on the same level as PCT. 
Meanwhile the speed is boosted up 29.2%, 43.6% and 52.9% than PCT respectively on ModelNet10, ModelNet40 and ShapeNetParts datasets.","PeriodicalId":120073,"journal":{"name":"2022 IEEE International Conference on Mechatronics and Automation (ICMA)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2022-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE International Conference on Mechatronics and Automation (ICMA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICMA54519.2022.9856260","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Detection and classification directly from 3D point clouds have received increasing attention in recent years. The Transformer is better suited than convolutional neural networks to processing point cloud data because of its inherent permutation invariance over input sequences. However, common sampling strategies increase the training time of Transformer-based models such as the Point Cloud Transformer (PCT). To address PCT's slow inference, this paper proposes a network named Fast Neighbor Embedding Point Cloud Transformer (FNE-PCT). In place of PCT's farthest point sampling (FPS) and nearest neighbor search, FNE-PCT uses a fast neighbor embedding module to improve inference speed and a residual self-attention encoding module to strengthen representational power. Extensive experiments on 3D object classification show that FNE-PCT outperforms strong baselines such as PointNet, PointNet++ and PointCNN. FNE-PCT achieves 92.6% accuracy on ModelNet40, on par with PCT, while inference is 29.2%, 43.6% and 52.9% faster than PCT on the ModelNet10, ModelNet40 and ShapeNetParts datasets, respectively.
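The abstract does not detail the fast neighbor embedding module itself, but the bottleneck it replaces, farthest point sampling (FPS), is a standard technique and can be sketched as follows. This is a minimal NumPy version for illustration (function and variable names are ours, not from the paper); its O(N × M) cost for selecting M samples from N points is what motivates replacing it.

```python
import numpy as np

def farthest_point_sampling(points, n_samples):
    """Iteratively pick the point farthest from the already-selected set.

    points: (N, 3) array of xyz coordinates.
    n_samples: number of points to keep.
    Returns indices of the selected points.
    """
    n = points.shape[0]
    selected = np.zeros(n_samples, dtype=np.int64)
    # min squared distance from each point to the selected set so far
    dist = np.full(n, np.inf)
    idx = 0  # start from an arbitrary point
    for i in range(n_samples):
        selected[i] = idx
        # squared distance from every point to the newest sample
        d = np.sum((points - points[idx]) ** 2, axis=1)
        dist = np.minimum(dist, d)
        # next sample: the point farthest from the current selection
        idx = int(np.argmax(dist))
    return selected
```

Each of the M iterations scans all N points, so downsampling a large cloud before the Transformer encoder adds noticeable latency; FNE-PCT's reported 29–53% speedups come from avoiding this step and the subsequent nearest neighbor search.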