Jianbo Zhang, Liang Yuan, Teng Ran, Jun Jia, Shuo Yang, Long Tang
{"title":"少即是多:一种有效的视觉动态SLAM目标特征提取方法","authors":"Jianbo Zhang , Liang Yuan , Teng Ran , Jun Jia , Shuo Yang , Long Tang","doi":"10.1016/j.displa.2025.103224","DOIUrl":null,"url":null,"abstract":"<div><div>Visual Simultaneous Localization and Mapping (VSLAM) is an essential foundation in augmented reality (AR) and mobile robotics. Dynamic scenes in the real world are a main challenge for VSLAM because it contravenes the fundamental assumptions based on static environments. Joint pose optimization with dynamic object modeling and camera pose estimation is a novel approach. However, it is challenging to model the motion of both the camera and the dynamic object when they are moving simultaneously. In this paper, we propose an efficient feature extraction approach for modeling dynamic object motion. We describe the object comprehensively through a more optimal feature selection strategy, which improves the performance of object tracking and pose estimation. The proposed approach combines image gradients and feature point clustering on dynamic objects. In the back-end optimization stage, we introduce rigid constraints on the dynamic object to optimize the poses using the graph model and obtain a high accuracy. The experimental results on the KITTI datasets demonstrate that the performance of the proposed approach is efficient and accurate.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"91 ","pages":"Article 103224"},"PeriodicalIF":3.4000,"publicationDate":"2025-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Less is more: An effective method to extract object features for visual dynamic SLAM\",\"authors\":\"Jianbo Zhang , Liang Yuan , Teng Ran , Jun Jia , Shuo Yang , Long Tang\",\"doi\":\"10.1016/j.displa.2025.103224\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Visual Simultaneous Localization and Mapping (VSLAM) is an essential foundation in augmented reality (AR) and mobile robotics. Dynamic scenes in the real world are a main challenge for VSLAM because it contravenes the fundamental assumptions based on static environments. Joint pose optimization with dynamic object modeling and camera pose estimation is a novel approach. However, it is challenging to model the motion of both the camera and the dynamic object when they are moving simultaneously. In this paper, we propose an efficient feature extraction approach for modeling dynamic object motion. We describe the object comprehensively through a more optimal feature selection strategy, which improves the performance of object tracking and pose estimation. The proposed approach combines image gradients and feature point clustering on dynamic objects. In the back-end optimization stage, we introduce rigid constraints on the dynamic object to optimize the poses using the graph model and obtain a high accuracy. 
The experimental results on the KITTI datasets demonstrate that the performance of the proposed approach is efficient and accurate.</div></div>\",\"PeriodicalId\":50570,\"journal\":{\"name\":\"Displays\",\"volume\":\"91 \",\"pages\":\"Article 103224\"},\"PeriodicalIF\":3.4000,\"publicationDate\":\"2025-09-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Displays\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0141938225002616\",\"RegionNum\":2,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Displays","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0141938225002616","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
Less is more: An effective method to extract object features for visual dynamic SLAM
Visual Simultaneous Localization and Mapping (VSLAM) is an essential foundation of augmented reality (AR) and mobile robotics. Dynamic scenes in the real world are a major challenge for VSLAM because they violate its fundamental assumption of a static environment. Jointly optimizing camera poses together with dynamic object models is a novel approach; however, modeling the motion of the camera and a dynamic object while both move simultaneously remains challenging. In this paper, we propose an efficient feature extraction approach for modeling dynamic object motion. Through a more selective feature strategy, we describe the object comprehensively, which improves the performance of object tracking and pose estimation. The proposed approach combines image gradients with feature point clustering on dynamic objects. In the back-end optimization stage, we introduce rigid-body constraints on the dynamic object and optimize the poses with a graph model, achieving high accuracy. Experimental results on the KITTI dataset demonstrate that the proposed approach is both efficient and accurate.
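The gradient-plus-clustering feature selection described in the abstract can be illustrated with a short sketch. The following Python/OpenCV snippet is a hypothetical reading of that idea, not the authors' implementation: the object mask, gradient threshold, and cluster count are assumptions made for illustration.

```python
import cv2
import numpy as np

def select_object_features(gray, object_mask, n_clusters=8, grad_thresh=30.0):
    """Keep one high-gradient keypoint per spatial cluster on the object.

    `object_mask` is assumed to be a binary mask of the dynamic object
    (e.g. from an instance segmentation front end); producing it is
    outside the scope of this sketch.
    """
    # Detect candidate keypoints on the whole frame.
    orb = cv2.ORB_create(nfeatures=1000)
    keypoints = orb.detect(gray, None)

    # Image gradient magnitude via Sobel filters.
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)
    grad_mag = cv2.magnitude(gx, gy)

    # Keep keypoints that lie on the object and sit on strong gradients.
    h, w = gray.shape[:2]
    candidates = []
    for kp in keypoints:
        x = min(int(round(kp.pt[0])), w - 1)
        y = min(int(round(kp.pt[1])), h - 1)
        if object_mask[y, x] and grad_mag[y, x] > grad_thresh:
            candidates.append(kp)
    if len(candidates) <= n_clusters:
        return candidates

    # Cluster candidate locations so the survivors cover the object evenly.
    pts = np.float32([kp.pt for kp in candidates])
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 20, 1.0)
    _, labels, _ = cv2.kmeans(pts, n_clusters, None, criteria, 5,
                              cv2.KMEANS_PP_CENTERS)

    # "Less is more": retain only the strongest keypoint in each cluster.
    selected = []
    for c in range(n_clusters):
        members = [kp for kp, lab in zip(candidates, labels.ravel()) if lab == c]
        if members:
            selected.append(max(members, key=lambda kp: kp.response))
    return selected
```

Keeping one strong keypoint per spatial cluster holds the feature set small while still covering the whole object, which is the trade-off the paper's title alludes to.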
Journal introduction:
Displays is the international journal covering the research and development of display technology, the effective presentation and perception of information, and applications and systems, including the display-human interface.
Technical papers on practical developments in display technology provide an effective channel to promote greater understanding and cross-fertilization across the diverse disciplines of the Displays community. Original research papers solving ergonomics issues at the display-human interface advance the effective presentation of information. Tutorial papers covering fundamentals, intended for display technologists and human factors engineers new to the field, will also occasionally be featured.