MFFAE-Net：利用多尺度特征融合和注意力增强网络进行点云语义分割

IF 2.3 4区计算机科学 Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Machine Vision and Applications Pub Date : 2024-08-12 DOI:10.1007/s00138-024-01589-1

Wei Liu, Yisheng Lu, Tao Zhang

{"title":"MFFAE-Net：利用多尺度特征融合和注意力增强网络进行点云语义分割","authors":"Wei Liu, Yisheng Lu, Tao Zhang","doi":"10.1007/s00138-024-01589-1","DOIUrl":null,"url":null,"abstract":"<p>Point cloud data can reflect more information about the real 3D space, which has gained increasing attention in computer vision field. But the unstructured and unordered nature of point clouds poses many challenges in their study. How to learn the global features of the point cloud in the original point cloud is a problem that has been accompanied by the research. In the research based on the structure of the encoder and decoder, many researchers focus on designing the encoder to better extract features, and do not further explore more globally representative features according to the features of the encoder and decoder. To solve this problem, we propose the MFFAE-Net method, which aims to obtain more globally representative point cloud features by using the feature learning of encoder decoder stage.Our method first enhances the feature information of the input point cloud by merging the information of its neighboring points, which is helpful for the following point cloud feature extraction work. Secondly, the channel attention module is used to further process the extracted features, so as to highlight the role of important channels in the features. Finally, we fuse features of different scales from encoding features and decoding features as well as features of the same scale, so as to obtain more global point cloud features, which will help improve the segmentation results of point clouds. Experimental results show that the method performs well on some objects in S3DIS dataset and Toronto3d dataset.</p>","PeriodicalId":51116,"journal":{"name":"Machine Vision and Applications","volume":"8 1","pages":""},"PeriodicalIF":2.3000,"publicationDate":"2024-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"MFFAE-Net: semantic segmentation of point clouds using multi-scale feature fusion and attention enhancement networks\",\"authors\":\"Wei Liu, Yisheng Lu, Tao Zhang\",\"doi\":\"10.1007/s00138-024-01589-1\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Point cloud data can reflect more information about the real 3D space, which has gained increasing attention in computer vision field. But the unstructured and unordered nature of point clouds poses many challenges in their study. How to learn the global features of the point cloud in the original point cloud is a problem that has been accompanied by the research. In the research based on the structure of the encoder and decoder, many researchers focus on designing the encoder to better extract features, and do not further explore more globally representative features according to the features of the encoder and decoder. To solve this problem, we propose the MFFAE-Net method, which aims to obtain more globally representative point cloud features by using the feature learning of encoder decoder stage.Our method first enhances the feature information of the input point cloud by merging the information of its neighboring points, which is helpful for the following point cloud feature extraction work. Secondly, the channel attention module is used to further process the extracted features, so as to highlight the role of important channels in the features. Finally, we fuse features of different scales from encoding features and decoding features as well as features of the same scale, so as to obtain more global point cloud features, which will help improve the segmentation results of point clouds. Experimental results show that the method performs well on some objects in S3DIS dataset and Toronto3d dataset.</p>\",\"PeriodicalId\":51116,\"journal\":{\"name\":\"Machine Vision and Applications\",\"volume\":\"8 1\",\"pages\":\"\"},\"PeriodicalIF\":2.3000,\"publicationDate\":\"2024-08-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Machine Vision and Applications\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1007/s00138-024-01589-1\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Machine Vision and Applications","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s00138-024-01589-1","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

摘要

点云数据可以反映真实三维空间的更多信息，在计算机视觉领域越来越受到关注。但是，点云的非结构化和无序性给点云的研究带来了诸多挑战。如何在原始点云中学习点云的全局特征是一直伴随着研究的问题。在基于编码器和解码器结构的研究中，很多研究者只注重设计编码器以更好地提取特征，并没有根据编码器和解码器的特点进一步探索更具全局代表性的特征。为了解决这个问题，我们提出了 MFFAE-Net 方法，旨在利用编码器解码器阶段的特征学习来获得更具全局代表性的点云特征。我们的方法首先通过合并输入点云相邻点的特征信息来增强输入点云的特征信息，这有助于接下来的点云特征提取工作。其次，利用通道关注模块对提取的特征进行进一步处理，从而突出重要通道在特征中的作用。最后，我们将编码特征和解码特征中不同尺度的特征以及相同尺度的特征进行融合，从而获得更多的全局点云特征，这将有助于改善点云的分割结果。实验结果表明，该方法在 S3DIS 数据集和 Toronto3d 数据集中的一些物体上表现良好。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

MFFAE-Net: semantic segmentation of point clouds using multi-scale feature fusion and attention enhancement networks

查看原文本刊更多论文

MFFAE-Net: semantic segmentation of point clouds using multi-scale feature fusion and attention enhancement networks

Point cloud data can reflect more information about the real 3D space, which has gained increasing attention in computer vision field. But the unstructured and unordered nature of point clouds poses many challenges in their study. How to learn the global features of the point cloud in the original point cloud is a problem that has been accompanied by the research. In the research based on the structure of the encoder and decoder, many researchers focus on designing the encoder to better extract features, and do not further explore more globally representative features according to the features of the encoder and decoder. To solve this problem, we propose the MFFAE-Net method, which aims to obtain more globally representative point cloud features by using the feature learning of encoder decoder stage.Our method first enhances the feature information of the input point cloud by merging the information of its neighboring points, which is helpful for the following point cloud feature extraction work. Secondly, the channel attention module is used to further process the extracted features, so as to highlight the role of important channels in the features. Finally, we fuse features of different scales from encoding features and decoding features as well as features of the same scale, so as to obtain more global point cloud features, which will help improve the segmentation results of point clouds. Experimental results show that the method performs well on some objects in S3DIS dataset and Toronto3d dataset.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Machine Vision and Applications 工程技术-工程：电子与电气

CiteScore

6.30

自引率

3.00%

发文量

审稿时长

8.7 months

期刊介绍： Machine Vision and Applications publishes high-quality technical contributions in machine vision research and development. Specifically, the editors encourage submittals in all applications and engineering aspects of image-related computing. In particular, original contributions dealing with scientific, commercial, industrial, military, and biomedical applications of machine vision, are all within the scope of the journal. Particular emphasis is placed on engineering and technology aspects of image processing and computer vision. The following aspects of machine vision applications are of interest: algorithms, architectures, VLSI implementations, AI techniques and expert systems for machine vision, front-end sensing, multidimensional and multisensor machine vision, real-time techniques, image databases, virtual reality and visualization. Papers must include a significant experimental validation component.