Evaluating end-to-end autonomous driving architectures: a proximal policy optimization approach in simulated environments
Authors: Ângelo Morgado, Kaoru Ota, Mianxiong Dong, Nuno Pombo
Journal: Autonomous Intelligent Systems, vol. 5, no. 1. Published 2025-07-25.
DOI: 10.1007/s43684-025-00102-3
Article: https://link.springer.com/article/10.1007/s43684-025-00102-3
PDF: https://link.springer.com/content/pdf/10.1007/s43684-025-00102-3.pdf
Autonomous driving systems (ADS) are at the forefront of technological innovation, promising enhanced safety, efficiency, and convenience in transportation. This study investigates the potential of end-to-end reinforcement learning (RL) architectures for ADS, specifically focusing on a Go-To-Point task involving lane-keeping and navigation through basic urban environments. The study uses the Proximal Policy Optimization (PPO) algorithm within the CARLA simulation environment. Traditional modular systems, which separate driving tasks into perception, decision-making, and control, provide interpretability and reliability in controlled scenarios but struggle with adaptability to dynamic, real-world conditions. In contrast, end-to-end systems offer a more integrated approach, potentially enhancing flexibility and decision-making cohesion.
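For readers unfamiliar with PPO, its defining element is a clipped surrogate objective that limits how far a single update can move the policy away from the one that collected the data. The following is a minimal pure-Python sketch of that standard objective, not the training code used in the study; the function name, the sample values, and the clip range of 0.2 are illustrative assumptions.

```python
def ppo_clip_objective(ratios, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective of PPO (a quantity to maximize).

    ratios      -- probability ratios pi_new(a|s) / pi_old(a|s), one per sample
    advantages  -- advantage estimates for the same samples
    clip_eps    -- clip range; 0.2 is a common default, assumed here
    """
    if not ratios or len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must be non-empty and equal in length")

    total = 0.0
    for ratio, adv in zip(ratios, advantages):
        # Restrict the ratio to the interval [1 - eps, 1 + eps].
        clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(ratios)


# Illustrative values only: a ratio far above 1 with a positive advantage
# is capped, so the objective stops rewarding further movement.
capped = ppo_clip_objective([1.5], [1.0])
# A ratio far below 1 with a negative advantage is also held at the bound.
floored = ppo_clip_objective([0.5], [-1.0])
```

Taking the minimum of the two terms means the objective never rewards pushing the ratio outside the clip range, which is the property that keeps successive policy updates small.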
This research introduces CARLA-GymDrive, a novel framework integrating the CARLA simulator with the Gymnasium API, enabling seamless RL experimentation with both discrete and continuous action spaces. Through a two-phase training regimen, the study evaluates the efficacy of PPO in an end-to-end ADS focused on basic tasks like lane-keeping and waypoint navigation. A comparative analysis with modular architectures is also provided. The findings highlight the strengths of PPO in managing continuous control tasks, achieving smoother and more adaptable driving behaviors than value-based algorithms like Deep Q-Networks. However, challenges remain in generalization and computational demands, with end-to-end systems requiring extensive training time.
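To make the idea of one environment serving both discrete and continuous action spaces concrete, here is a hedged sketch of a toy Go-To-Point environment. It is not the CARLA-GymDrive API and does not use CARLA; the class name, action table, vehicle dynamics, and reward are all hypothetical. It only mirrors the Gymnasium calling convention, in which `reset` returns `(observation, info)` and `step` returns `(observation, reward, terminated, truncated, info)`.

```python
import math

# Hypothetical discrete action table: index -> (throttle, steer, brake).
DISCRETE_ACTIONS = {
    0: (0.0, 0.0, 1.0),   # brake
    1: (0.5, 0.0, 0.0),   # drive straight
    2: (0.5, -0.3, 0.0),  # steer left
    3: (0.5, 0.3, 0.0),   # steer right
}


def _clamp(value, low, high):
    return max(low, min(value, high))


class ToyGoToPointEnv:
    """Toy point-vehicle environment with a Gymnasium-style interface."""

    def __init__(self, goal=(20.0, 0.0), continuous=True, max_steps=200, dt=0.1):
        self.goal = goal
        self.continuous = continuous
        self.max_steps = max_steps
        self.dt = dt
        self.reset()

    def _distance(self):
        return math.hypot(self.goal[0] - self.x, self.goal[1] - self.y)

    def _obs(self):
        # Observation: offset to the goal, heading, and speed.
        return (self.goal[0] - self.x, self.goal[1] - self.y, self.heading, self.speed)

    def reset(self, seed=None, options=None):
        self.x, self.y, self.heading, self.speed = 0.0, 0.0, 0.0, 0.0
        self.steps = 0
        return self._obs(), {}

    def step(self, action):
        # Continuous mode takes (throttle, steer, brake) directly;
        # discrete mode looks the same triple up in the action table.
        throttle, steer, brake = action if self.continuous else DISCRETE_ACTIONS[action]
        throttle = _clamp(throttle, 0.0, 1.0)
        steer = _clamp(steer, -1.0, 1.0)
        brake = _clamp(brake, 0.0, 1.0)

        prev_dist = self._distance()
        # Made-up dynamics: throttle accelerates, brake decelerates.
        self.speed = max(0.0, self.speed + (3.0 * throttle - 6.0 * brake) * self.dt)
        self.heading += 0.5 * steer * self.speed * self.dt
        self.x += self.speed * math.cos(self.heading) * self.dt
        self.y += self.speed * math.sin(self.heading) * self.dt
        self.steps += 1

        dist = self._distance()
        reward = prev_dist - dist              # progress toward the goal
        terminated = dist < 1.0                # goal reached
        truncated = self.steps >= self.max_steps
        return self._obs(), reward, terminated, truncated, {}


def rollout(env, action):
    """Repeat one fixed action until the episode ends; return the summary."""
    obs, _ = env.reset()
    total, terminated, truncated = 0.0, False, False
    while not (terminated or truncated):
        obs, reward, terminated, truncated, _ = env.step(action)
        total += reward
    return obs, total, terminated


discrete_result = rollout(ToyGoToPointEnv(continuous=False), 1)
continuous_result = rollout(ToyGoToPointEnv(continuous=True), (0.5, 0.0, 0.0))
```

In a real Gymnasium environment the class would subclass `gymnasium.Env` and declare its action space with `gymnasium.spaces.Discrete` or `gymnasium.spaces.Box`, so that an RL library can pick a matching policy head. The sketch omits that dependency to stay self-contained.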
While the study underscores the potential of end-to-end architectures, it also identifies limitations in scalability and real-world applicability, suggesting that modular systems may currently be more feasible for practical ADS deployment. Nonetheless, the CARLA-GymDrive framework and the insights gained from the PPO-based ADS contribute to the field, laying a foundation for future advances in autonomous driving.