A Dual Attention KPConv Network Combined With Attention Gates for Semantic Segmentation of ALS Point Clouds
Jinbiao Zhao; Hangyu Zhou; Feifei Pan
IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1-14. Published 2024-07-04. DOI: 10.1109/TGRS.2024.3422829
https://ieeexplore.ieee.org/document/10586745/
Kernel point convolution (KPConv) defines convolutional weights based on the Euclidean distances between kernel points and input points, and it has shown good segmentation results on several datasets. However, it does not consider the intrinsic connection between input points and features, which is crucial for the semantic segmentation of airborne laser scanning (ALS) point clouds with sparse density and complex backgrounds. To address this problem, we design a dual attention KPConv network combined with attention gates (DAKAG-Net) for semantic segmentation of ALS point clouds. Specifically, we design the channel and spatial attention KPConv (CSAKPConv) block for the encoder, which first adaptively refines the input feature map along two separate dimensions, channel and spatial, and then performs kernel point convolution. In addition, to make better use of high-level semantic information and to detect objects of varying sizes, DAKAG-Net incorporates multiple attention gates (MAGs) that merge the lowest-level features, the skip-connected features, and the corresponding upsampled features during decoding. Finally, the decoded features are convolved with kernels of several sizes and merged to obtain multiscale receptive-field features. On the ISPRS 3-D dataset, the proposed DAKAG-Net improves overall accuracy (OA), mean F1 score (mF1), and mean intersection over union (mIoU) by 3.5%, 3.1%, and 3.5%, respectively, over the baseline, reaching 85.2% (OA), 73.7% (mF1), and 61.2% (mIoU). Moreover, DAKAG-Net also obtains new state-of-the-art segmentation results on the DFC2019 and LASDU datasets.
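The three scores reported in the abstract (OA, mF1, mIoU) are standard confusion-matrix metrics for semantic segmentation. The sketch below shows one common way to compute them with NumPy. It is an illustration only: the function name `segmentation_metrics`, the row/column convention, and the handling of classes with no points are assumptions made here, not the evaluation protocol used by the authors.

```python
import numpy as np


def segmentation_metrics(conf):
    """Return (OA, mF1, mIoU) from a square confusion matrix.

    Assumed convention (not taken from the paper):
    conf[i, j] = number of points whose true class is i and predicted class is j.
    """
    conf = np.asarray(conf, dtype=np.float64)
    tp = np.diag(conf)                 # correctly labelled points per class
    fp = conf.sum(axis=0) - tp         # predicted as the class, but wrong
    fn = conf.sum(axis=1) - tp         # belong to the class, but missed

    # Overall accuracy: fraction of all points labelled correctly.
    oa = tp.sum() / conf.sum()

    # Per-class F1 = 2TP / (2TP + FP + FN); IoU = TP / (TP + FP + FN).
    # Classes with an empty denominator score 0 here (an assumption).
    f1_den = 2 * tp + fp + fn
    iou_den = tp + fp + fn
    f1 = np.divide(2 * tp, f1_den, out=np.zeros_like(tp), where=f1_den > 0)
    iou = np.divide(tp, iou_den, out=np.zeros_like(tp), where=iou_den > 0)

    # mF1 and mIoU are unweighted means over classes.
    return oa, f1.mean(), iou.mean()


if __name__ == "__main__":
    # Toy two-class example with made-up counts, for illustration only.
    oa, mf1, miou = segmentation_metrics([[8, 2], [1, 9]])
    print(f"OA={oa:.3f}  mF1={mf1:.3f}  mIoU={miou:.3f}")
```

Because mF1 and mIoU average over classes without weighting, rare classes count as much as frequent ones. This is why they can sit well below OA on imbalanced ALS scenes, as in the 85.2% OA versus 61.2% mIoU reported above.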
About the journal:
IEEE Transactions on Geoscience and Remote Sensing (TGRS) is a monthly publication that focuses on the theory, concepts, and techniques of science and engineering as applied to sensing the land, oceans, atmosphere, and space; and the processing, interpretation, and dissemination of this information.