YGC-SLAM: A visual SLAM based on improved YOLOv5 and geometric constraints for dynamic indoor environments

Q1 Computer Science
Juncheng ZHANG, Fuyang KE, Qinqin TANG, Wenming YU, Ming ZHANG
Virtual Reality & Intelligent Hardware, Volume 7, Issue 1, Pages 62-82. Published 2025-02-01. Journal Article.
DOI: 10.1016/j.vrih.2024.05.001
https://www.sciencedirect.com/science/article/pii/S2096579624000214
Citation count: 0

YGC-SLAM: A visual SLAM based on improved YOLOv5 and geometric constraints for dynamic indoor environments

Background

Visual simultaneous localization and mapping (SLAM) is primarily based on the assumption of a static scene, so dynamic objects in the frame degrade system robustness and make position estimation inaccurate. In this study, we propose YGC-SLAM, a visual SLAM system for dynamic indoor environments that builds on the ORB-SLAM2 framework and combines semantic and geometric constraints to improve the positioning accuracy and robustness of the system.

Methods

First, the recognition accuracy of YOLOv5 was improved by introducing the convolutional block attention module (CBAM) and an improved EIoU loss function, so that the predicted bounding box converges quickly and detection improves. The improved YOLOv5 was then added to the tracking thread to detect dynamic targets and eliminate dynamic points. Subsequently, multi-view geometric constraints were used to re-judge the points: this further eliminates dynamic points while retaining more useful feature points, and it prevents the semantic approach from over-eliminating feature points, which would cause map building to fail. The K-means clustering algorithm was used to accelerate this process by quickly calculating and determining the motion state of each cluster of pixel points. Finally, a keyframe drawing strategy with redundancy removal was implemented to construct a clear 3D dense static point-cloud map.
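To make the loss term mentioned above concrete, the following is a minimal sketch of the standard EIoU loss for a single pair of axis-aligned boxes. The abstract does not describe how the authors modified EIoU, so this is the commonly published baseline form, not the paper's improved variant. The function name `eiou_loss`, the `(x1, y1, x2, y2)` box layout, and the `eps` stabilizer are assumptions made for illustration.

```python
def eiou_loss(pred, target, eps=1e-9):
    """Standard EIoU loss for two boxes given as (x1, y1, x2, y2).

    Illustrative baseline only; the paper's "improved" EIoU is not
    specified in the abstract and is not reproduced here.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union of the two boxes.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared distance between box centers, normalized by the squared
    # diagonal of the enclosing box.
    dx = (px1 + px2) / 2.0 - (tx1 + tx2) / 2.0
    dy = (py1 + py2) / 2.0 - (ty1 + ty2) / 2.0
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Width and height differences, each normalized by the matching
    # side of the enclosing box (the terms that distinguish EIoU).
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)

    return 1.0 - iou + center_term + width_term + height_term
```

Because the width and height gaps are penalized directly, the loss still gives a useful gradient signal when the boxes do not overlap, which is one reason this family of losses is associated with faster box convergence.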

Results

In tests on the TUM dataset and in a real environment, our algorithm reduces the absolute trajectory error by 98.22% and the relative trajectory error by 97.98% compared with the original ORB-SLAM2. It is also more accurate and has better real-time performance than similar algorithms such as DynaSLAM and DS-SLAM.
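As a reading aid for the percentages above, here is a small sketch of how a relative error reduction is typically computed from per-frame trajectory errors. It assumes the errors are summarized by their root-mean-square value, which is a common convention but is not stated in the abstract. The helper names `rmse` and `percent_reduction` and the numbers in the usage lines are hypothetical and are not taken from the paper.

```python
import math


def rmse(errors):
    """Root-mean-square of per-frame trajectory errors (e.g. in meters)."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def percent_reduction(baseline, improved):
    """Relative reduction of an error metric, in percent of the baseline."""
    return (baseline - improved) / baseline * 100.0


# Made-up per-frame errors, for illustration only (not the paper's data).
baseline_errors = [0.40, 0.50, 0.60]
improved_errors = [0.01, 0.01, 0.01]
reduction = percent_reduction(rmse(baseline_errors), rmse(improved_errors))
```

A reduction close to 100% means the improved error is a small fraction of the baseline error, so the figure depends strongly on how badly the baseline fails in highly dynamic sequences.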

Conclusions

The YGC-SLAM proposed in this study can effectively eliminate the adverse effects of dynamic objects, and the system can better complete positioning and map building tasks in complex environments.
Source journal
Virtual Reality & Intelligent Hardware (Computer Science: Computer Graphics and Computer-Aided Design)
CiteScore: 6.40
Self-citation rate: 0.00%
Articles published: 35
Review time: 12 weeks