用于点云分析的三维定向编码

IF 3.4 3区计算机科学 Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

IEEE Access Pub Date : 2024-10-02 DOI:10.1109/ACCESS.2024.3472301

Yoonjae Jung;Sang-Hyun Lee;Seung-Woo Seo

{"title":"用于点云分析的三维定向编码","authors":"Yoonjae Jung;Sang-Hyun Lee;Seung-Woo Seo","doi":"10.1109/ACCESS.2024.3472301","DOIUrl":null,"url":null,"abstract":"Extracting informative local features in point clouds is crucial for accurately understanding spatial information inside 3D point data. Previous works utilize either complex network designs or simple multi-layer perceptrons (MLP) to extract the local features. However, complex networks often incur high computational cost, whereas simple MLP may struggle to capture the spatial relations among local points effectively. These challenges limit their scalability to delicate and real-time tasks, such as autonomous driving and robot navigation. To address these challenges, we propose a novel 3D Directional Encoding Network (3D-DENet) capable of effectively encoding spatial relations with low computational cost. 3D-DENet extracts spatial and point features separately. The key component of 3D-DENet for spatial feature extraction is Directional Encoding (DE), which encodes the cosine similarity between direction vectors of local points and trainable direction vectors. To extract point features, we also propose Local Point Feature Multi-Aggregation (LPFMA), which integrates various aspects of local point features using diverse aggregation functions. By leveraging DE and LPFMA in a hierarchical structure, 3D-DENet efficiently captures both detailed spatial and high-level semantic features from point clouds. Experiments show that 3D-DENet is effective and efficient in classification and segmentation tasks. In particular, 3D-DENet achieves an overall accuracy of 90.7% and a mean accuracy of 90.1% on ScanObjectNN, outperforming the current state-of-the-art method while using only 47% floating point operations.","PeriodicalId":13079,"journal":{"name":"IEEE Access","volume":"12 ","pages":"144533-144543"},"PeriodicalIF":3.4000,"publicationDate":"2024-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10703059","citationCount":"0","resultStr":"{\"title\":\"3D Directional Encoding for Point Cloud Analysis\",\"authors\":\"Yoonjae Jung;Sang-Hyun Lee;Seung-Woo Seo\",\"doi\":\"10.1109/ACCESS.2024.3472301\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Extracting informative local features in point clouds is crucial for accurately understanding spatial information inside 3D point data. Previous works utilize either complex network designs or simple multi-layer perceptrons (MLP) to extract the local features. However, complex networks often incur high computational cost, whereas simple MLP may struggle to capture the spatial relations among local points effectively. These challenges limit their scalability to delicate and real-time tasks, such as autonomous driving and robot navigation. To address these challenges, we propose a novel 3D Directional Encoding Network (3D-DENet) capable of effectively encoding spatial relations with low computational cost. 3D-DENet extracts spatial and point features separately. The key component of 3D-DENet for spatial feature extraction is Directional Encoding (DE), which encodes the cosine similarity between direction vectors of local points and trainable direction vectors. To extract point features, we also propose Local Point Feature Multi-Aggregation (LPFMA), which integrates various aspects of local point features using diverse aggregation functions. By leveraging DE and LPFMA in a hierarchical structure, 3D-DENet efficiently captures both detailed spatial and high-level semantic features from point clouds. Experiments show that 3D-DENet is effective and efficient in classification and segmentation tasks. In particular, 3D-DENet achieves an overall accuracy of 90.7% and a mean accuracy of 90.1% on ScanObjectNN, outperforming the current state-of-the-art method while using only 47% floating point operations.\",\"PeriodicalId\":13079,\"journal\":{\"name\":\"IEEE Access\",\"volume\":\"12 \",\"pages\":\"144533-144543\"},\"PeriodicalIF\":3.4000,\"publicationDate\":\"2024-10-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10703059\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Access\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10703059/\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Access","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10703059/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}

引用次数: 0

摘要

提取点云中信息丰富的局部特征对于准确理解三维点数据中的空间信息至关重要。以往的研究利用复杂的网络设计或简单的多层感知器（MLP）来提取局部特征。然而，复杂的网络往往会产生高昂的计算成本，而简单的 MLP 可能难以有效捕捉局部点之间的空间关系。这些挑战限制了它们在自动驾驶和机器人导航等精细和实时任务中的可扩展性。为了应对这些挑战，我们提出了一种新颖的三维定向编码网络（3D-DENet），它能够以较低的计算成本有效地编码空间关系。3D-DENet 可分别提取空间特征和点特征。用于空间特征提取的 3D-DENet 的关键组件是方向编码（DE），它对局部点的方向向量与可训练方向向量之间的余弦相似性进行编码。为了提取点特征，我们还提出了局部点特征多重聚合（LPFMA），它利用不同的聚合函数整合了局部点特征的各个方面。通过利用分层结构中的 DE 和 LPFMA，3D-DENet 可以从点云中有效捕捉到详细的空间特征和高级语义特征。实验表明，3D-DENet 在分类和分割任务中非常有效。特别是，3D-DENet 在 ScanObjectNN 上的总体准确率达到了 90.7%，平均准确率达到了 90.1%，超过了目前最先进的方法，同时只使用了 47% 的浮点运算。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

3D Directional Encoding for Point Cloud Analysis

Extracting informative local features in point clouds is crucial for accurately understanding spatial information inside 3D point data. Previous works utilize either complex network designs or simple multi-layer perceptrons (MLP) to extract the local features. However, complex networks often incur high computational cost, whereas simple MLP may struggle to capture the spatial relations among local points effectively. These challenges limit their scalability to delicate and real-time tasks, such as autonomous driving and robot navigation. To address these challenges, we propose a novel 3D Directional Encoding Network (3D-DENet) capable of effectively encoding spatial relations with low computational cost. 3D-DENet extracts spatial and point features separately. The key component of 3D-DENet for spatial feature extraction is Directional Encoding (DE), which encodes the cosine similarity between direction vectors of local points and trainable direction vectors. To extract point features, we also propose Local Point Feature Multi-Aggregation (LPFMA), which integrates various aspects of local point features using diverse aggregation functions. By leveraging DE and LPFMA in a hierarchical structure, 3D-DENet efficiently captures both detailed spatial and high-level semantic features from point clouds. Experiments show that 3D-DENet is effective and efficient in classification and segmentation tasks. In particular, 3D-DENet achieves an overall accuracy of 90.7% and a mean accuracy of 90.1% on ScanObjectNN, outperforming the current state-of-the-art method while using only 47% floating point operations.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

IEEE Access COMPUTER SCIENCE, INFORMATION SYSTEMSENGIN-ENGINEERING, ELECTRICAL & ELECTRONIC

CiteScore

9.80

自引率

7.70%

发文量

6673

审稿时长

6 weeks

期刊介绍： IEEE Access® is a multidisciplinary, open access (OA), applications-oriented, all-electronic archival journal that continuously presents the results of original research or development across all of IEEE''s fields of interest. IEEE Access will publish articles that are of high interest to readers, original, technically correct, and clearly presented. Supported by author publication charges (APC), its hallmarks are a rapid peer review and publication process with open access to all readers. Unlike IEEE''s traditional Transactions or Journals, reviews are "binary", in that reviewers will either Accept or Reject an article in the form it is submitted in order to achieve rapid turnaround. Especially encouraged are submissions on: Multidisciplinary topics, or applications-oriented articles and negative results that do not fit within the scope of IEEE''s traditional journals. Practical articles discussing new experiments or measurement techniques, interesting solutions to engineering. Development of new or improved fabrication or manufacturing techniques. Reviews or survey articles of new or evolving fields oriented to assist others in understanding the new area.