YGC-SLAM: A visual SLAM based on improved YOLOv5 and geometric constraints for dynamic indoor environments

Q1 Computer Science

Virtual Reality Intelligent Hardware. Pub Date: 2025-02-01. DOI: 10.1016/j.vrih.2024.05.001

Juncheng ZHANG, Fuyang KE, Qinqin TANG, Wenming YU, Ming ZHANG

Volume 7, Issue 1, Pages 62-82. Full text: https://www.sciencedirect.com/science/article/pii/S2096579624000214


Abstract

Background

Visual simultaneous localization and mapping (SLAM) generally assumes a static scene, so dynamic objects in the frame degrade system robustness and make position estimation inaccurate. In this study, we propose YGC-SLAM, a system for dynamic indoor environments that builds on the ORB-SLAM2 framework and combines semantic and geometric constraints to improve positioning accuracy and robustness.

Methods

First, the detection accuracy of YOLOv5 was improved by introducing the convolutional block attention module (CBAM) and an improved EIoU loss function, so that the predicted bounding box converges quickly and detection improves. The improved YOLOv5 was then added to the tracking thread to detect dynamic targets and eliminate dynamic points. Subsequently, multi-view geometric constraints were used to re-judge these points: this further eliminates dynamic points while retaining more useful feature points, and it prevents the semantic approach from over-eliminating feature points, which would cause map building to fail. The K-means clustering algorithm was used to accelerate this process by quickly computing and determining the motion state of each cluster of pixels. Finally, a de-redundant keyframe selection strategy was implemented to construct a clear, dense, static 3D point-cloud map.
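The re-judging step can be illustrated with a minimal sketch. The abstract does not say which multi-view constraint is used; the code below assumes an epipolar-distance test, a common choice, and every name in it (`epipolar_distance`, `keep_static_matches`, the 1-pixel threshold, the toy fundamental matrix) is hypothetical rather than taken from the paper. A matched feature is discarded only if it lies inside a detected dynamic-object box and also violates the epipolar constraint, so static points that happen to fall inside a box are retained.

```python
import math

def epipolar_distance(F, p_prev, p_curr):
    """Pixel distance from p_curr to the epipolar line F @ [x, y, 1] of p_prev."""
    x, y = p_prev
    a, b, c = (row[0] * x + row[1] * y + row[2] for row in F)
    norm = math.hypot(a, b)
    if norm == 0.0:
        return float("inf")  # degenerate line: treat as violating the constraint
    u, v = p_curr
    return abs(a * u + b * v + c) / norm

def in_box(pt, box):
    """True if pt = (x, y) lies inside box = (x1, y1, x2, y2)."""
    x, y = pt
    x1, y1, x2, y2 = box
    return x1 <= x <= x2 and y1 <= y <= y2

def keep_static_matches(matches, boxes, F, thresh=1.0):
    """Drop a match only if it is inside a dynamic box AND off its epipolar line."""
    kept = []
    for p_prev, p_curr in matches:
        suspicious = any(in_box(p_curr, b) for b in boxes)
        if suspicious and epipolar_distance(F, p_prev, p_curr) > thresh:
            continue  # semantic and geometric cues agree: dynamic point
        kept.append((p_prev, p_curr))
    return kept

# Toy setup (assumption): pure sideways camera motion with identity intrinsics,
# so F = [t]_x with t = (1, 0, 0) and static points keep the same image row.
F = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
boxes = [(0, 0, 100, 100)]            # one detected dynamic-object box
matches = [
    ((10, 20), (15, 20)),             # inside box, same row: kept
    ((50, 60), (55, 68)),             # inside box, 8 px off the line: dropped
    ((200, 100), (205, 100)),         # outside every box: kept unchecked
]
kept = keep_static_matches(matches, boxes, F)
print(kept)
```

In this sketch, points outside all boxes are never tested geometrically; whether YGC-SLAM also checks them is not stated in the abstract. The K-means acceleration would amount to running the same test once per cluster of pixels instead of once per point.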

Results

Tests on the TUM dataset and in a real environment show that our algorithm reduces the absolute trajectory error by 98.22% and the relative trajectory error by 97.98% compared with the original ORB-SLAM2. It is also more accurate and has better real-time performance than similar algorithms such as DynaSLAM and DS-SLAM.
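For readers unfamiliar with how such percentages are usually obtained, here is a minimal sketch of an error-reduction calculation. It assumes the reported statistic is the RMSE of the absolute trajectory error and that the estimated and ground-truth trajectories are already time-associated and aligned; the abstract states neither. The function names and the toy trajectories are illustrative only and are not data from the paper.

```python
import math

def ate_rmse(estimated, ground_truth):
    """RMSE of per-pose translational error between two aligned trajectories."""
    if not estimated or len(estimated) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and equally long")
    squared = [
        sum((e - g) ** 2 for e, g in zip(p_est, p_gt))
        for p_est, p_gt in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared) / len(squared))

def reduction_percent(baseline, improved):
    """Relative error reduction of `improved` with respect to `baseline`, in %."""
    if baseline <= 0:
        raise ValueError("baseline error must be positive")
    return (baseline - improved) / baseline * 100.0

# Toy values (not from the paper): three poses along the x axis.
ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
baseline_est = [(0.0, 0.3, 0.4), (1.0, 0.3, 0.4), (2.0, 0.3, 0.4)]
improved_est = [(0.0, 0.03, 0.04), (1.0, 0.03, 0.04), (2.0, 0.03, 0.04)]

baseline_rmse = ate_rmse(baseline_est, ground_truth)
improved_rmse = ate_rmse(improved_est, ground_truth)
print(f"{reduction_percent(baseline_rmse, improved_rmse):.2f}% reduction")
```

A reduction near 98% therefore means the improved system's error is roughly one fiftieth of the baseline's on the same sequence.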

Conclusions

The proposed YGC-SLAM effectively eliminates the adverse effects of dynamic objects, allowing the system to better complete positioning and map-building tasks in complex environments.
