Tengwei Li, Linzheng Ye, Xijing Zhu, Shida Chuai, Jialong Wu, Wanqi Zhang, Wenlong Li
{"title":"基于改进YOLO算法和多视点几何的VSLAM算法研究","authors":"Tengwei Li, Linzheng Ye, Xijing Zhu, Shida Chuai, Jialong Wu, Wanqi Zhang, Wenlong Li","doi":"10.1002/rob.22569","DOIUrl":null,"url":null,"abstract":"<div>\n \n <p>Visual Simultaneous Localization and Mapping (VSLAM) uses camera sensors for environmental sensing and localization, widely applied in robotics, unmanned vehicles, and other sectors. Traditional VSLAMs typically assume static environments, but dynamic objects in such settings can cause feature point mismatches, significantly impairing system accuracy and robustness. Furthermore, existing dynamic VSLAMs suffer from issues like inadequate real-time performance. To tackle the challenges of dynamic environments, this paper adopts ORB-SLAM2 as the framework, integrates the YOLOv5 object detection module and a dynamic feature rejection module, and introduces a dynamic VSLAM system that leverages YOLO's object detection and motion geometry's depth fusion, termed YOLO Geometry Simultaneous Visual Localization and Mapping(YG-VSLAM). This paper's algorithm differs significantly from other dynamic algorithms, focusing on basic feature points for dynamic feature point identification and elimination. Initially, the algorithm's front-end extracts feature points from the input image. Concurrently, the target detection module identifies dynamic classes, delineating dynamic and static regions. Subsequently, a six-class region classification strategy is applied to further categorize these regions into more detailed categories, such as suspected dynamic and static classes. Finally, a multi-vision geometric method is employed to detect and eliminate feature points within each region. This paper conducts a comprehensive evaluation using the TUM data set, assessing both accuracy and real-time performance. 
The experimental outcomes demonstrate the algorithm's effectiveness and practicality.</p>\n </div>","PeriodicalId":192,"journal":{"name":"Journal of Field Robotics","volume":"42 7","pages":"3093-3104"},"PeriodicalIF":5.2000,"publicationDate":"2025-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Research on VSLAM Algorithm Based on Improved YOLO Algorithm and Multi-View Geometry\",\"authors\":\"Tengwei Li, Linzheng Ye, Xijing Zhu, Shida Chuai, Jialong Wu, Wanqi Zhang, Wenlong Li\",\"doi\":\"10.1002/rob.22569\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div>\\n \\n <p>Visual Simultaneous Localization and Mapping (VSLAM) uses camera sensors for environmental sensing and localization, widely applied in robotics, unmanned vehicles, and other sectors. Traditional VSLAMs typically assume static environments, but dynamic objects in such settings can cause feature point mismatches, significantly impairing system accuracy and robustness. Furthermore, existing dynamic VSLAMs suffer from issues like inadequate real-time performance. To tackle the challenges of dynamic environments, this paper adopts ORB-SLAM2 as the framework, integrates the YOLOv5 object detection module and a dynamic feature rejection module, and introduces a dynamic VSLAM system that leverages YOLO's object detection and motion geometry's depth fusion, termed YOLO Geometry Simultaneous Visual Localization and Mapping(YG-VSLAM). This paper's algorithm differs significantly from other dynamic algorithms, focusing on basic feature points for dynamic feature point identification and elimination. Initially, the algorithm's front-end extracts feature points from the input image. Concurrently, the target detection module identifies dynamic classes, delineating dynamic and static regions. 
Subsequently, a six-class region classification strategy is applied to further categorize these regions into more detailed categories, such as suspected dynamic and static classes. Finally, a multi-vision geometric method is employed to detect and eliminate feature points within each region. This paper conducts a comprehensive evaluation using the TUM data set, assessing both accuracy and real-time performance. The experimental outcomes demonstrate the algorithm's effectiveness and practicality.</p>\\n </div>\",\"PeriodicalId\":192,\"journal\":{\"name\":\"Journal of Field Robotics\",\"volume\":\"42 7\",\"pages\":\"3093-3104\"},\"PeriodicalIF\":5.2000,\"publicationDate\":\"2025-04-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Field Robotics\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://onlinelibrary.wiley.com/doi/10.1002/rob.22569\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"ROBOTICS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Field Robotics","FirstCategoryId":"94","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/rob.22569","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ROBOTICS","Score":null,"Total":0}
Research on VSLAM Algorithm Based on Improved YOLO Algorithm and Multi-View Geometry
Visual Simultaneous Localization and Mapping (VSLAM) uses camera sensors for environmental sensing and localization and is widely applied in robotics, unmanned vehicles, and other fields. Traditional VSLAM systems typically assume static environments, but dynamic objects in such settings cause feature-point mismatches, significantly impairing system accuracy and robustness. Furthermore, existing dynamic VSLAM systems suffer from issues such as inadequate real-time performance. To tackle the challenges of dynamic environments, this paper adopts ORB-SLAM2 as the framework, integrates a YOLOv5 object detection module and a dynamic feature rejection module, and introduces a dynamic VSLAM system that fuses YOLO object detection with multi-view motion geometry, termed YOLO Geometry Simultaneous Visual Localization and Mapping (YG-VSLAM). Unlike other dynamic SLAM algorithms, the proposed method operates directly on basic feature points to identify and eliminate dynamic ones. First, the algorithm's front end extracts feature points from the input image. Concurrently, the object detection module identifies dynamic classes, delineating dynamic and static regions. Next, a six-class region classification strategy further categorizes these regions into finer categories, such as suspected-dynamic and static classes. Finally, a multi-view geometry method detects and eliminates dynamic feature points within each region. A comprehensive evaluation on the TUM dataset assesses both accuracy and real-time performance. The experimental outcomes demonstrate the algorithm's effectiveness and practicality.
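A multi-view geometry check of the kind the abstract describes is commonly realized as an epipolar-constraint test: feature matches whose second-view points lie far from the epipolar line predicted by the fundamental matrix are flagged as dynamic. The sketch below is a minimal NumPy illustration of that idea under assumed inputs (a known fundamental matrix `F`, pixel coordinates, and a pixel threshold); it is not the authors' implementation, and the function names and threshold are chosen here for illustration only.

```python
import numpy as np

def epipolar_distance(F, pts1, pts2):
    """Distance (in pixels) of each point in pts2 from the epipolar
    line F @ p1 induced by its match p1 in the first image.
    F: 3x3 fundamental matrix; pts1, pts2: Nx2 pixel coordinates."""
    ones = np.ones((pts1.shape[0], 1))
    p1 = np.hstack([pts1, ones])      # Nx3 homogeneous points, image 1
    p2 = np.hstack([pts2, ones])      # Nx3 homogeneous points, image 2
    lines = p1 @ F.T                  # epipolar lines in image 2 (a,b,c)
    num = np.abs(np.sum(lines * p2, axis=1))        # |a*x + b*y + c|
    den = np.sqrt(lines[:, 0]**2 + lines[:, 1]**2)  # line normalization
    return num / den

def flag_dynamic(F, pts1, pts2, thresh=1.0):
    """Matches violating the epipolar constraint by more than `thresh`
    pixels are treated as dynamic and can be rejected."""
    return epipolar_distance(F, pts1, pts2) > thresh
```

For a static scene point the residual is near zero (up to noise), so a threshold of one or a few pixels separates static matches from those on moving objects; in a full system the fundamental matrix would itself be estimated robustly (e.g., with RANSAC) from the tentatively static matches.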
Journal Introduction:
The Journal of Field Robotics seeks to promote scholarly publications dealing with the fundamentals of robotics in unstructured and dynamic environments.
The Journal focuses on experimental robotics and encourages publication of work that has both theoretical and practical significance.