Autonomous Driving Systems: Developing an Approach based on A* and Double Q-Learning

2021 7th International Conference on Web Research (ICWR). Pub Date: 2021-05-19. DOI: 10.1109/ICWR51868.2021.9443139

Faezeh Jamshidi, Lei Zhang, Fahimeh Nezhadalinaei


Citations: 5

Abstract


Autonomous Driving Systems: Developing an Approach based on A* and Double Q-Learning

Autonomous driving is among the most attractive research fields for academia and industry, and intelligent transportation plays a vital role in the structure of autonomous driving systems. Artificial Intelligence (AI) provides the infrastructure for autonomous driving through the design of intelligent machines. Deep learning, a subfield of AI, builds models that mimic the functioning of the human brain to make decisions, and it has shown great success in autonomous driving systems. However, it performs very poorly in some stochastic environments because action values are greatly overestimated. We therefore apply a double estimator to Q-learning, constructing Double Q-learning, a new off-policy reinforcement learning algorithm. With this algorithm we can approximate the maximum expected value of any number of random variables, and it tends to underestimate rather than overestimate that maximum. Moreover, we use an optimization method based on A* to improve routing in autonomous driving. The proposed approach, based on Double Q-learning and A*, is evaluated in an example environment with random obstacles, and the results are compared with those of Q-learning alone. The results show that the proposed approach performs better in terms of trip duration to the destination and collisions with obstacles.
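
The abstract names two building blocks, Double Q-learning and A*, without giving their details. The following is only a minimal sketch of each, not the authors' implementation. It assumes a tabular state-action representation, a 4-connected grid with unit step costs, and a Manhattan-distance heuristic. The names `double_q_update` and `astar`, the dictionary-based tables, and the default values of `alpha` and `gamma` are all illustrative assumptions.

```python
import heapq
import random


def double_q_update(qa, qb, s, a, r, s2, actions,
                    alpha=0.1, gamma=0.95, rng=random):
    """One tabular Double Q-learning step; qa and qb map (state, action) to a value."""
    # Randomly pick the table to update; the other table evaluates the greedy action.
    if rng.random() < 0.5:
        upd, other = qa, qb
    else:
        upd, other = qb, qa
    # Select the greedy next action with the table being updated ...
    best = max(actions, key=lambda b: upd.get((s2, b), 0.0))
    # ... but score it with the other table, decoupling selection from evaluation.
    target = r + gamma * other.get((s2, best), 0.0)
    old = upd.get((s, a), 0.0)
    upd[(s, a)] = old + alpha * (target - old)


def astar(grid, start, goal):
    """A* on a 4-connected grid (1 = obstacle); returns a list of cells or None."""
    def h(p):
        # Manhattan distance: admissible for unit-cost 4-connected moves.
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    frontier = [(h(start), 0, start)]  # entries are (f = g + h, g, cell)
    parent = {start: None}
    cost = {start: 0}
    while frontier:
        _, g, cur = heapq.heappop(frontier)
        if cur == goal:
            path = []
            while cur is not None:
                path.append(cur)
                cur = parent[cur]
            return path[::-1]
        if g > cost[cur]:
            continue  # stale queue entry, a cheaper route was already found
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cur[0] + dr, cur[1] + dc)
            if not (0 <= nxt[0] < len(grid) and 0 <= nxt[1] < len(grid[0])):
                continue
            if grid[nxt[0]][nxt[1]] == 1:
                continue
            if g + 1 < cost.get(nxt, float("inf")):
                cost[nxt] = g + 1
                parent[nxt] = cur
                heapq.heappush(frontier, (g + 1 + h(nxt), g + 1, nxt))
    return None  # goal unreachable
```

One plausible way to combine the two, again an assumption rather than something the abstract states, is to let `astar` propose a route around known obstacles while `double_q_update` learns action values for reacting to obstacles encountered along the way.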
