{"title":"Flow-ICP:基于4D时间序列对齐的点云语义分割。","authors":"Shuyi Tan, Chao Huang, Yi Zhang, Yang Wang","doi":"10.1364/AO.562944","DOIUrl":null,"url":null,"abstract":"<p><p>The semantic segmentation is a critical task in LiDAR point cloud processing. Leveraging temporal information to provide contextual data for regions with low visibility or sparse observations has recently become a popular research direction, especially in autonomous driving. Existing methods, however, are often over-reliant on past frames, leading to cumulative errors (drift) caused by unconstrained frame-by-frame stacking. This paper proposes a dynamic alignment of historical frame memory information to ensure consistency with the observations of the current frame, reducing deviations caused by viewpoint changes or object movements and ensuring more accurate capture of current frame features. In addition, a new multi-scale feature fusion method, to the best of our knowledge, was introduced using the spatiotemporal (ST) method to extract the ST features, which reduces the inconsistencies between 2D range image coordinates and 3D Cartesian outputs. This approach enhances feature representation by optimizing and fusing the aligned channel features. This method was evaluated on the SemanticKITTI and SensatUrban datasets. The experimental results showed that it outperforms existing state-of-the-art methods regarding accuracy.</p>","PeriodicalId":101299,"journal":{"name":"Applied optics","volume":"64 27","pages":"8068-8076"},"PeriodicalIF":0.0000,"publicationDate":"2025-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Flow-ICP: semantic segmentation of point clouds based on 4D time-series alignment.\",\"authors\":\"Shuyi Tan, Chao Huang, Yi Zhang, Yang Wang\",\"doi\":\"10.1364/AO.562944\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>The semantic segmentation is a critical task in LiDAR point cloud processing. Leveraging temporal information to provide contextual data for regions with low visibility or sparse observations has recently become a popular research direction, especially in autonomous driving. Existing methods, however, are often over-reliant on past frames, leading to cumulative errors (drift) caused by unconstrained frame-by-frame stacking. This paper proposes a dynamic alignment of historical frame memory information to ensure consistency with the observations of the current frame, reducing deviations caused by viewpoint changes or object movements and ensuring more accurate capture of current frame features. In addition, a new multi-scale feature fusion method, to the best of our knowledge, was introduced using the spatiotemporal (ST) method to extract the ST features, which reduces the inconsistencies between 2D range image coordinates and 3D Cartesian outputs. This approach enhances feature representation by optimizing and fusing the aligned channel features. This method was evaluated on the SemanticKITTI and SensatUrban datasets. 
The experimental results showed that it outperforms existing state-of-the-art methods regarding accuracy.</p>\",\"PeriodicalId\":101299,\"journal\":{\"name\":\"Applied optics\",\"volume\":\"64 27\",\"pages\":\"8068-8076\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2025-09-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Applied optics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1364/AO.562944\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Applied optics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1364/AO.562944","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Flow-ICP: semantic segmentation of point clouds based on 4D time-series alignment.
Semantic segmentation is a critical task in LiDAR point cloud processing. Leveraging temporal information to provide context for regions with low visibility or sparse observations has recently become a popular research direction, especially in autonomous driving. Existing methods, however, often over-rely on past frames, leading to cumulative errors (drift) caused by unconstrained frame-by-frame stacking. This paper proposes dynamically aligning historical frame memory to the observations of the current frame, reducing deviations caused by viewpoint changes or object motion and capturing current-frame features more accurately. In addition, to the best of our knowledge, a new multi-scale feature fusion method is introduced that uses a spatiotemporal (ST) approach to extract ST features, reducing the inconsistencies between 2D range-image coordinates and 3D Cartesian outputs. This approach enhances feature representation by optimizing and fusing the aligned channel features. The method was evaluated on the SemanticKITTI and SensatUrban datasets, and the experimental results show that it outperforms existing state-of-the-art methods in accuracy.
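The dynamic alignment described in the abstract registers a historical frame to the current frame before temporal features are fused. The sketch below is a minimal point-to-point ICP in NumPy/SciPy that illustrates this general idea only; it is not the paper's Flow-ICP, and the function names and parameters (icp_align, best_fit_transform, max_iter, tol) are hypothetical assumptions made for illustration.

```python
# Minimal point-to-point ICP sketch (NOT the paper's Flow-ICP): aligns a previous
# LiDAR frame to the current frame with a rigid transform estimated by SVD (Kabsch).
import numpy as np
from scipy.spatial import cKDTree

def best_fit_transform(src, dst):
    """Least-squares rigid transform (R, t) mapping src (N,3) onto dst (N,3)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)          # cross-covariance of centered clouds
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                     # correct a reflection, if any
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    t = dst_c - R @ src_c
    return R, t

def icp_align(prev_frame, curr_frame, max_iter=30, tol=1e-6):
    """Iteratively align prev_frame (N,3) to curr_frame (M,3).

    Returns the aligned points and the accumulated (R, t).
    """
    tree = cKDTree(curr_frame)
    src = prev_frame.copy()
    R_total, t_total = np.eye(3), np.zeros(3)
    prev_err = np.inf
    for _ in range(max_iter):
        dists, idx = tree.query(src)             # nearest-neighbor correspondences
        R, t = best_fit_transform(src, curr_frame[idx])
        src = src @ R.T + t                      # apply the incremental transform
        R_total, t_total = R @ R_total, R @ t_total + t
        err = dists.mean()
        if abs(prev_err - err) < tol:            # stop when the residual plateaus
            break
        prev_err = err
    return src, (R_total, t_total)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    curr = rng.normal(size=(500, 3))
    # Synthesize a "previous frame" by rotating/translating the current one.
    theta = 0.1
    R_gt = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                     [np.sin(theta),  np.cos(theta), 0.0],
                     [0.0, 0.0, 1.0]])
    prev = curr @ R_gt.T + np.array([0.2, -0.1, 0.05])
    aligned, _ = icp_align(prev, curr)
    print("mean residual after alignment:",
          np.linalg.norm(aligned - curr, axis=1).mean())
```

In practice, this kind of registration is usually initialized with ego-motion (odometry) and followed by feature fusion on the aligned frames; the small __main__ demo only checks that a synthetically transformed frame is recovered.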