Comparing Physics Effects through Reinforcement Learning in the ARORA Simulator

Proceedings of the 33rd European Modeling & Simulation Symposium. Pub Date: 1900-01-01. DOI: 10.46354/i3m.2021.emss.015

Troyle Thomas, Armando Fandango, D. Reed, C. Hoayun, J. Hurter, Alexander Gutierrez, K. Brawner

Citations: 1

Comparing Physics Effects through Reinforcement Learning in the ARORA Simulator

This study tests several physics levels for training autonomous-vehicle navigation with a deep deterministic policy gradient (DDPG) algorithm, addressing a lack of research on how physics levels affect vehicle behaviour, specifically for reinforcement-learning algorithms. Three measures from a PointGoal Navigation task were investigated: simulator run-time, training steps, and agent effectiveness, the last assessed through the Success weighted by (normalised inverse) Path Length (SPL) measure. Training and testing occurred in the novel simulator ARORA (A Realistic Open environment for Rapid Agent training). The goal of ARORA is to provide a high-fidelity, open-source platform for simulation, using physics-based movement, vehicle modelling, and a continuous action space within a large-scale geospecific city environment. Four physics levels, or models, were used to create four different curriculum conditions for training. The SPL was highest for the condition using all physics levels defined for the experiment, and two conditions returned zero values. Future researchers should consider providing adequate support when training complex-physics vehicle models. The run-time results revealed a benefit for experimental machines with a better CPU, at least for the vector-only observations we employed.
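For readers unfamiliar with the SPL measure named in the abstract, the following is a minimal sketch of how it is commonly computed: each episode contributes its success flag times the ratio of the shortest-path length to the longer of the shortest-path length and the path actually taken, averaged over all episodes. The function name `spl`, its parameter names, and the example numbers are illustrative assumptions and are not taken from the paper or from the ARORA code base.

```python
from typing import Sequence


def spl(
    successes: Sequence[bool],
    shortest_lengths: Sequence[float],
    path_lengths: Sequence[float],
) -> float:
    """Success weighted by (normalised inverse) Path Length over N episodes.

    successes[i]        -- whether episode i reached the goal
    shortest_lengths[i] -- shortest-path distance from start to goal (must be > 0)
    path_lengths[i]     -- length of the path the agent actually travelled
    """
    n = len(successes)
    if n == 0:
        raise ValueError("at least one episode is required")
    if not (n == len(shortest_lengths) == len(path_lengths)):
        raise ValueError("all inputs must have the same length")

    total = 0.0
    for success, shortest, taken in zip(successes, shortest_lengths, path_lengths):
        if shortest <= 0:
            raise ValueError("shortest-path lengths must be positive")
        if success:
            # A successful episode scores 1.0 on an optimal path and less
            # the further the travelled path exceeds the shortest one.
            total += shortest / max(taken, shortest)
        # A failed episode contributes 0 regardless of path length.
    return total / n


if __name__ == "__main__":
    # Hypothetical episodes: one optimal success, one success on a path
    # twice as long as necessary, and one failure.
    print(spl([True, True, False], [10.0, 10.0, 10.0], [10.0, 20.0, 5.0]))
```

Under this definition, a condition scores an SPL of exactly zero only when none of its evaluation episodes succeed, which is one way to read the zero values reported for two of the conditions.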
