{"title":"Flow-ICP:基于4D时间序列对齐的点云语义分割。","authors":"Shuyi Tan, Chao Huang, Yi Zhang, Yang Wang","doi":"10.1364/AO.562944","DOIUrl":null,"url":null,"abstract":"<p><p>The semantic segmentation is a critical task in LiDAR point cloud processing. Leveraging temporal information to provide contextual data for regions with low visibility or sparse observations has recently become a popular research direction, especially in autonomous driving. Existing methods, however, are often over-reliant on past frames, leading to cumulative errors (drift) caused by unconstrained frame-by-frame stacking. This paper proposes a dynamic alignment of historical frame memory information to ensure consistency with the observations of the current frame, reducing deviations caused by viewpoint changes or object movements and ensuring more accurate capture of current frame features. In addition, a new multi-scale feature fusion method, to the best of our knowledge, was introduced using the spatiotemporal (ST) method to extract the ST features, which reduces the inconsistencies between 2D range image coordinates and 3D Cartesian outputs. This approach enhances feature representation by optimizing and fusing the aligned channel features. This method was evaluated on the SemanticKITTI and SensatUrban datasets. The experimental results showed that it outperforms existing state-of-the-art methods regarding accuracy.</p>","PeriodicalId":101299,"journal":{"name":"Applied optics","volume":"64 27","pages":"8068-8076"},"PeriodicalIF":0.0000,"publicationDate":"2025-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Flow-ICP: semantic segmentation of point clouds based on 4D time-series alignment.\",\"authors\":\"Shuyi Tan, Chao Huang, Yi Zhang, Yang Wang\",\"doi\":\"10.1364/AO.562944\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>The semantic segmentation is a critical task in LiDAR point cloud processing. Leveraging temporal information to provide contextual data for regions with low visibility or sparse observations has recently become a popular research direction, especially in autonomous driving. Existing methods, however, are often over-reliant on past frames, leading to cumulative errors (drift) caused by unconstrained frame-by-frame stacking. This paper proposes a dynamic alignment of historical frame memory information to ensure consistency with the observations of the current frame, reducing deviations caused by viewpoint changes or object movements and ensuring more accurate capture of current frame features. In addition, a new multi-scale feature fusion method, to the best of our knowledge, was introduced using the spatiotemporal (ST) method to extract the ST features, which reduces the inconsistencies between 2D range image coordinates and 3D Cartesian outputs. This approach enhances feature representation by optimizing and fusing the aligned channel features. This method was evaluated on the SemanticKITTI and SensatUrban datasets. 
The experimental results showed that it outperforms existing state-of-the-art methods regarding accuracy.</p>\",\"PeriodicalId\":101299,\"journal\":{\"name\":\"Applied optics\",\"volume\":\"64 27\",\"pages\":\"8068-8076\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2025-09-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Applied optics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1364/AO.562944\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Applied optics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1364/AO.562944","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Flow-ICP: semantic segmentation of point clouds based on 4D time-series alignment.
Semantic segmentation is a critical task in LiDAR point cloud processing. Leveraging temporal information to provide context for regions with low visibility or sparse observations has recently become a popular research direction, especially in autonomous driving. Existing methods, however, often over-rely on past frames, leading to cumulative errors (drift) caused by unconstrained frame-by-frame stacking. This paper proposes dynamically aligning historical frame memory to the observations of the current frame, reducing deviations caused by viewpoint changes or object motion and capturing current-frame features more accurately. In addition, to the best of our knowledge, a new multi-scale feature fusion method is introduced that uses a spatiotemporal (ST) approach to extract ST features, reducing the inconsistencies between 2D range-image coordinates and 3D Cartesian outputs. This approach enhances feature representation by optimizing and fusing the aligned channel features. The method was evaluated on the SemanticKITTI and SensatUrban datasets, and the experimental results show that it outperforms existing state-of-the-art methods in accuracy.
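The dynamic alignment described in the abstract registers a historical frame to the current frame before temporal features are fused. The sketch below is a minimal point-to-point ICP in NumPy/SciPy that illustrates this general idea only; it is not the paper's Flow-ICP, and the function names and parameters (icp_align, best_fit_transform, max_iter, tol) are hypothetical assumptions made for illustration.

```python
# Minimal point-to-point ICP sketch (NOT the paper's Flow-ICP): aligns a previous
# LiDAR frame to the current frame with a rigid transform estimated by SVD (Kabsch).
import numpy as np
from scipy.spatial import cKDTree

def best_fit_transform(src, dst):
    """Least-squares rigid transform (R, t) mapping src (N,3) onto dst (N,3)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)          # cross-covariance of centered clouds
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                     # correct a reflection, if any
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    t = dst_c - R @ src_c
    return R, t

def icp_align(prev_frame, curr_frame, max_iter=30, tol=1e-6):
    """Iteratively align prev_frame (N,3) to curr_frame (M,3).

    Returns the aligned points and the accumulated (R, t).
    """
    tree = cKDTree(curr_frame)
    src = prev_frame.copy()
    R_total, t_total = np.eye(3), np.zeros(3)
    prev_err = np.inf
    for _ in range(max_iter):
        dists, idx = tree.query(src)             # nearest-neighbor correspondences
        R, t = best_fit_transform(src, curr_frame[idx])
        src = src @ R.T + t                      # apply the incremental transform
        R_total, t_total = R @ R_total, R @ t_total + t
        err = dists.mean()
        if abs(prev_err - err) < tol:            # stop when the residual plateaus
            break
        prev_err = err
    return src, (R_total, t_total)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    curr = rng.normal(size=(500, 3))
    # Synthesize a "previous frame" by rotating/translating the current one.
    theta = 0.1
    R_gt = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                     [np.sin(theta),  np.cos(theta), 0.0],
                     [0.0, 0.0, 1.0]])
    prev = curr @ R_gt.T + np.array([0.2, -0.1, 0.05])
    aligned, _ = icp_align(prev, curr)
    print("mean residual after alignment:",
          np.linalg.norm(aligned - curr, axis=1).mean())
```

In practice, this kind of registration is usually initialized with ego-motion (odometry) and followed by feature fusion on the aligned frames; the small __main__ demo only checks that a synthetically transformed frame is recovered.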