Yijun Chen;Xianwei Zheng;Zhulun Yang;Xutao Li;Jiantao Zhou;Yuanman Li
DuPMAM: An Efficient Dual Perception Framework Equipped With a Sharp Testing Strategy for Point Cloud Analysis
IEEE Transactions on Multimedia, vol. 27, pp. 1760-1771. Published 2024-12-27. DOI: 10.1109/TMM.2024.3521735
https://ieeexplore.ieee.org/document/10817618/
Journal impact factor: 8.4. JCR: Q1 (Computer Science, Information Systems). Open access: no.
Citations: 0
Abstract
The challenges in point cloud analysis stem primarily from the irregular and unordered nature of the data. Many existing approaches, inspired by the Transformer, introduce attention mechanisms to extract 3D geometric features. However, these intricate geometric extractors incur high computational overhead and high inference latency. To address this, we propose a lightweight and fast attention-based network for point cloud analysis, named Dual Perception MAM (DuPMAM). Specifically, we present a novel, simple Point Multiplicative Attention Mechanism (PMAM). It is implemented solely through single feed-forward fully connected layers, which lowers model complexity and improves inference speed. Building on PMAM, we devise a dual perception strategy that constructs both a local attention block and a global attention block to learn fine-grained geometric features and overall representational features, respectively. Consequently, compared with existing approaches, our method perceives both the local details and the global contours of point cloud objects well. In addition, we design a Graph-Multiscale Perceptual Field (GMPF) testing strategy to enhance model performance. It has a significant advantage over the traditional voting strategy and is generally applicable to point cloud tasks, including classification, part segmentation, and indoor scene segmentation. Empowered by the GMPF testing strategy, DuPMAM achieves new state-of-the-art results on the real-world dataset ScanObjectNN, the synthetic dataset ModelNet40, and the part segmentation dataset ShapeNet. Compared with the recent GB-Net, DuPMAM trains 6 times faster and tests 2 times faster.
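The abstract gives no equations for PMAM, so the following is only a generic illustration of attention that is multiplicative and computed by a single fully connected layer. It is not the authors' implementation. The sigmoid gate, the C-to-C layer shape, and the names `multiplicative_attention` and `init_layer` are all assumptions made for this sketch.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def linear(x, weight, bias):
    """Apply one fully connected layer to a single feature vector."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def multiplicative_attention(features, weight, bias):
    """Gate each point's features with scores from one FC layer.

    Hypothetical sketch, not the paper's PMAM.
    features: list of N per-point feature vectors, each of length C.
    weight, bias: parameters of a single C-to-C fully connected layer.
    """
    out = []
    for x in features:
        # Attention scores in (0, 1), one per channel (assumed gate).
        scores = [sigmoid(s) for s in linear(x, weight, bias)]
        # Multiplicative step: element-wise product of scores and features.
        out.append([a * v for a, v in zip(scores, x)])
    return out


def init_layer(channels, seed=0):
    """Random weights for the illustrative layer (hypothetical helper)."""
    rng = random.Random(seed)
    weight = [[rng.uniform(-0.5, 0.5) for _ in range(channels)]
              for _ in range(channels)]
    bias = [0.0] * channels
    return weight, bias


if __name__ == "__main__":
    points = [[0.2, -1.0, 0.5], [1.5, 0.3, -0.7]]
    weight, bias = init_layer(3)
    for row in multiplicative_attention(points, weight, bias):
        print(row)
```

Because each point is gated independently with shared weights, reordering the input points simply reorders the output. This is a desirable property for the unordered data the abstract describes. A local or global attention block, as named in the abstract, would additionally need neighborhood grouping or pooling, which this sketch leaves out.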
About the journal:
The IEEE Transactions on Multimedia delves into diverse aspects of multimedia technology and applications, covering circuits, networking, signal processing, systems, software, and systems integration. The scope aligns with the Fields of Interest of the sponsors, ensuring a comprehensive exploration of research in multimedia.