Graph Transformer for 3D point clouds classification and semantic segmentation

Wei Zhou, Qian Wang, Weiwei Jin, Xinzhe Shi, Ying He

Computers & Graphics, Volume 124, Article 104050. Published 2024-08-22. DOI: 10.1016/j.cag.2024.104050
Citations: 0
Abstract
Recently, graph-based and Transformer-based deep learning methods have demonstrated excellent performance on various point cloud tasks. Most existing graph-based methods rely on static graphs, which take a fixed input to establish graph relations. Moreover, many graph-based methods aggregate neighboring features by max or average pooling, so that either a single neighboring point determines the centroid's feature or all neighboring points influence it equally, ignoring the correlations and differences between points. Most Transformer-based approaches extract point cloud features with global attention and lack feature learning on local neighbors. To address these issues of graph-based and Transformer-based models, we propose a new feature extraction block named Graph Transformer and construct a 3D point cloud learning network called GTNet to learn point cloud features at both local and global scales. Graph Transformer integrates the advantages of graph-based and Transformer-based methods, and consists of a Local Transformer that uses intra-domain cross-attention and a Global Transformer that uses global self-attention. Finally, we apply GTNet to shape classification, part segmentation, and semantic segmentation tasks. The experimental results show that our model achieves strong learning and prediction ability on most tasks. The source code and pre-trained model of GTNet will be released at https://github.com/NWUzhouwei/GTNet.
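The two attention patterns the abstract contrasts can be illustrated with a toy numpy sketch: a "local" step where each centroid point attends only to its k nearest neighbors (cross-attention between the centroid query and neighbor keys/values), and a "global" step where every point attends to all points (self-attention). This is not the paper's GTNet architecture, which uses learned projections and deeper blocks; all function names and the unprojected dot-product scoring here are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax over the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def knn_indices(points, k):
    # brute-force k-nearest-neighbor graph over (N, 3) coordinates;
    # column 0 of argsort is the point itself (distance 0), so skip it
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    return np.argsort(d, axis=1)[:, 1:k + 1]          # (N, k)

def local_attention(feats, nbr_idx):
    # each centroid's feature is a softmax-weighted sum of its
    # neighbors' features (a minimal stand-in for intra-domain
    # cross-attention; no learned Q/K/V projections here)
    nbr = feats[nbr_idx]                              # (N, k, C)
    scores = np.einsum('nc,nkc->nk', feats, nbr)
    scores /= np.sqrt(feats.shape[-1])
    w = softmax(scores, axis=-1)                      # (N, k)
    return np.einsum('nk,nkc->nc', w, nbr)            # (N, C)

def global_attention(feats):
    # every point attends to every point (global self-attention)
    scores = feats @ feats.T / np.sqrt(feats.shape[-1])
    w = softmax(scores, axis=-1)                      # (N, N)
    return w @ feats                                  # (N, C)
```

The contrast is visible in the attention-weight shapes: local attention mixes only k neighbor features per point, while global attention forms a full N-by-N weight matrix, which is why Transformer-only models capture long-range context but miss the fine local geometry that graph neighborhoods provide.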
Journal introduction:
Computers & Graphics is dedicated to disseminating information on research and applications of computer graphics (CG) techniques. The journal encourages articles on:
1. Research and applications of interactive computer graphics. We are particularly interested in novel interaction techniques and applications of CG to problem domains.
2. State-of-the-art papers on late-breaking, cutting-edge research on CG.
3. Information on innovative uses of graphics principles and technologies.
4. Tutorial papers on both teaching CG principles and innovative uses of CG in education.