{"title":"SAFIT: Segmentation-Aware Scene Flow with Improved Transformer","authors":"Yukang Shi, Kaisheng Ma","doi":"10.1109/icra46639.2022.9811747","DOIUrl":null,"url":null,"abstract":"Scene flow prediction is a challenging task that aims at jointly estimating the 3D structure and 3D motion of dynamic scenes. The previous methods concentrate more on point-wise estimation instead of considering the correspondence between objects as well as lacking the sensation of high-level semantic knowledge. In this paper, we propose a concise yet effective method for scene flow prediction. The key idea is to extend the view of all points for computing point cloud features into object-level, thus simultaneously modeling the relationships of the object-level and point-level via an improved transformer. In addition, we introduce a novel unsupervised loss called segmentation-aware loss, which can model semanticaware details to help predict scene flow more accurately and robustly. Since this loss can be trained without any ground truth, it can be used in both supervised training and self-supervised training. Experiments on both supervised training and self-supervised training demonstrate the effectiveness of our method. On supervised training, 3.8%, 22.58%, 10.90% and 21.82 % accuracy boosts than FLOT [23] can be observed on FT3Ds, KITTIs, FT3Do and KITTIo datasets. On self-supervised scheme, 48.23% and 48.96% accuracy boost than PointPWC-Net [40] can be observed on KITTIo and KITTIs datasets.","PeriodicalId":341244,"journal":{"name":"2022 International Conference on Robotics and Automation (ICRA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 International Conference on Robotics and Automation (ICRA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/icra46639.2022.9811747","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 6
Abstract
Scene flow prediction is a challenging task that aims at jointly estimating the 3D structure and 3D motion of dynamic scenes. Previous methods concentrate on point-wise estimation rather than considering correspondences between objects, and they lack high-level semantic knowledge. In this paper, we propose a concise yet effective method for scene flow prediction. The key idea is to extend the point-level view used for computing point cloud features to the object level, simultaneously modeling object-level and point-level relationships via an improved transformer. In addition, we introduce a novel unsupervised loss, the segmentation-aware loss, which models semantic-aware details to help predict scene flow more accurately and robustly. Since this loss requires no ground truth, it can be used in both supervised and self-supervised training. Experiments with both training schemes demonstrate the effectiveness of our method. With supervised training, accuracy gains of 3.8%, 22.58%, 10.90%, and 21.82% over FLOT [23] are observed on the FT3Ds, KITTIs, FT3Do, and KITTIo datasets. Under the self-supervised scheme, accuracy gains of 48.23% and 48.96% over PointPWC-Net [40] are observed on the KITTIo and KITTIs datasets.
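The abstract does not give implementation details, but the core idea of jointly modeling point-level and object-level relationships can be illustrated with a minimal PyTorch sketch: per-point queries attend over a small set of soft-pooled object-level keys and values. Everything below (the module name, the soft object assignment `assign_logits`, and the pooling scheme) is our own assumption for illustration, not the authors' actual improved-transformer architecture.

```python
import torch
import torch.nn as nn

class PointObjectAttention(nn.Module):
    """Hypothetical cross-attention from point-level queries to
    object-level keys/values: a sketch of the idea described in the
    abstract, not the paper's actual architecture."""

    def __init__(self, dim: int):
        super().__init__()
        self.to_q = nn.Linear(dim, dim)
        self.to_k = nn.Linear(dim, dim)
        self.to_v = nn.Linear(dim, dim)

    def forward(self, point_feats: torch.Tensor,
                assign_logits: torch.Tensor) -> torch.Tensor:
        # point_feats:   (B, N, C) per-point features
        # assign_logits: (B, N, K) soft assignment of points to K object
        #                slots, assumed to come from an upstream module
        assign = assign_logits.softmax(dim=-1)                    # (B, N, K)
        # Soft-pool point features into K object-level features.
        obj = torch.einsum("bnk,bnc->bkc", assign, point_feats)
        obj = obj / (assign.sum(dim=1).unsqueeze(-1) + 1e-6)      # (B, K, C)
        q = self.to_q(point_feats)                                # (B, N, C)
        k = self.to_k(obj)                                        # (B, K, C)
        v = self.to_v(obj)                                        # (B, K, C)
        attn = torch.softmax(q @ k.transpose(1, 2) / q.shape[-1] ** 0.5,
                             dim=-1)                              # (B, N, K)
        # Each point aggregates object-level context; the residual
        # connection preserves point-level detail.
        return point_feats + attn @ v
```

In this reading, each point's update mixes fine-grained point features with pooled object context, which matches the abstract's description of simultaneously modeling the two levels, at least at a high level.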
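Similarly, one plausible reading of the segmentation-aware loss is a flow-smoothness term weighted by soft segment affinity: neighboring points that likely belong to the same segment should move with similar flow, and no ground-truth flow or labels are needed. The sketch below is a hypothetical formulation under that assumption; the function name and all inputs are ours, not the paper's definition.

```python
import torch

def segmentation_aware_smoothness(flow: torch.Tensor,
                                  seg_probs: torch.Tensor,
                                  knn_idx: torch.Tensor) -> torch.Tensor:
    """Hypothetical segmentation-aware smoothness loss.

    flow:      (B, N, 3) predicted scene flow
    seg_probs: (B, N, K) soft segmentation probabilities per point
    knn_idx:   (B, N, M) indices of each point's M nearest neighbors
    """
    B, N, M = knn_idx.shape
    batch = torch.arange(B, device=flow.device).view(B, 1, 1)
    nbr_flow = flow[batch, knn_idx]       # (B, N, M, 3) neighbor flows
    nbr_seg = seg_probs[batch, knn_idx]   # (B, N, M, K) neighbor seg probs
    # Segment affinity: dot product of soft segmentation distributions,
    # high when a point and its neighbor likely share a segment.
    w = (seg_probs.unsqueeze(2) * nbr_seg).sum(dim=-1)        # (B, N, M)
    # Penalize flow differences mainly within the same segment.
    diff = (flow.unsqueeze(2) - nbr_flow).norm(dim=-1)        # (B, N, M)
    return (w * diff).mean()
```

Because the weights come from predicted segmentation rather than annotations, such a term could be added to either a supervised or a self-supervised objective, consistent with the abstract's claim that the loss trains without ground truth.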