Design of Reward Functions for RL-based High-Speed Autonomous Driving

Tanaka Kohsuke, Yuta Shintomi, Y. Okuyama, Taro Suzuki

2022 IEEE 15th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC), December 2022. DOI: 10.1109/MCSoC57363.2022.00015
We aim to design a reward function for reinforcement-learning-based autonomous driving that achieves high-speed driving while maintaining training stability, i.e., reliably reaching the racetrack's goal. High-speed driving is aggressive: for example, cornering as fast as possible at the edge of the road. Creating reinforcement learning agents that drive at high speed and still reach the goal is therefore difficult in racing situations, because such agents tend to run off the road or collide with other objects. Human drivers, in general, look at the road ahead when making control decisions. We therefore design a reward function that considers the road ahead as a function of driving speed. In simulator experiments, we compared the proposed reward function with reward functions from previous work in terms of driving speed and training stability (the rate of reaching the goal). The proposed reward function improves lap time by 0.71 seconds (3%) at the cost of only a 4.4% loss in goal-reaching stability compared with the most stable reward function proposed in previous work.
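The abstract does not give the paper's exact reward formula, but the core idea — rewarding progress while penalizing deviation from the centerline over a look-ahead window that grows with driving speed — can be sketched as follows. All function and parameter names (`base_lookahead`, `speed_gain`, the linear shaping terms) are illustrative assumptions, not the authors' actual design.

```python
def lookahead_reward(speed, center_offsets_ahead, progress_delta,
                     base_lookahead=5, speed_gain=0.5):
    """Hypothetical speed-dependent look-ahead reward (sketch).

    speed: current driving speed of the agent.
    center_offsets_ahead: normalized distances from the track centerline
        at upcoming waypoints, ordered from nearest to farthest.
    progress_delta: track progress made since the last step.
    """
    # The look-ahead window widens with speed: a faster agent must
    # account for more of the road ahead, as a human driver would.
    n = base_lookahead + int(speed_gain * speed)
    window = center_offsets_ahead[:n]

    # Penalty: mean deviation from the centerline over the window.
    deviation = sum(abs(o) for o in window) / max(len(window), 1)

    # Reward fast progress, discourage drifting toward the road edge.
    return progress_delta * speed - deviation
```

At low speed the penalty only reflects the road immediately ahead, so the agent may cut close to the edge in slow corners; at high speed the wider window penalizes trajectories that would leave the road farther downstream, which is one plausible way to trade a little peak speed for goal-reaching stability.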