Research on target detection method based on attention mechanism and reinforcement learning

Q. Wang, Chenxi Xu, Hongwei Du, Yuxuan Liu, Yang Liu, Yujia Fu, Kai Li, Haobin Shi
DOI: 10.1117/12.2668537
Journal: International Conference on Artificial Intelligence, Virtual Reality, and Visualization
Publication date: 2023-03-01

Abstract

The development of intelligent manufacturing promotes the intellectualization of traditional navigation technology. Because the actor-critic (AC) algorithm is difficult to converge in practical applications, this paper adopts an optimized variant of it, the deep deterministic policy gradient (DDPG) algorithm. Through experience replay and a dual-network design, learning speed is greatly improved over the original algorithm. Because a curiosity strategy is advantageous in alleviating the sparse-reward problem, this paper also adopts a curiosity mechanism as an intrinsic-reward exploration strategy and proposes a DDPG method based on an improved curiosity mechanism, addressing the situation in which robots lack external rewards in some complex environments and therefore cannot complete their tasks. Simulation and real-world experimental results show that the proposed method completes navigation tasks more stably and performs well in long-distance autonomous navigation.
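The abstract names three ingredients: experience replay, DDPG's dual (online/target) network design, and a curiosity-based intrinsic reward. The following is a minimal sketch of those pieces, not the paper's implementation; the names (`ReplayBuffer`, `soft_update`, `intrinsic_reward`) and the scaling factor `eta` are illustrative assumptions.

```python
import random
from collections import deque


class ReplayBuffer:
    """Experience replay: store past transitions, sample decorrelated minibatches."""

    def __init__(self, capacity):
        self.buf = deque(maxlen=capacity)

    def push(self, transition):
        # transition = (state, action, reward, next_state, done)
        self.buf.append(transition)

    def sample(self, batch_size):
        return random.sample(self.buf, batch_size)


def soft_update(target_params, online_params, tau):
    """Polyak averaging used with DDPG's dual (online/target) networks:
    target <- tau * online + (1 - tau) * target, for each parameter."""
    return [tau * o + (1.0 - tau) * t
            for o, t in zip(online_params, target_params)]


def intrinsic_reward(predicted_next_state, actual_next_state, eta=0.5):
    """Curiosity bonus: scaled squared error of a learned forward model's
    next-state prediction. Poorly predicted (novel) states yield more reward."""
    err = sum((p - a) ** 2
              for p, a in zip(predicted_next_state, actual_next_state))
    return eta * err
```

During training, the agent would optimize against the combined signal `r_total = r_extrinsic + intrinsic_reward(...)`, so exploration is still rewarded even in regions where the external reward is sparse.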