Optimizing bucket-filling strategies for wheel loaders inside a dream environment

Daniel Eriksson, Reza Ghabcheloo, Marcus Geimer
Automation in Construction (Q1, Construction & Building Technology; Impact Factor 9.6; CAS Region 1, Engineering & Technology)
DOI: 10.1016/j.autcon.2024.105804
Published: 2024-10-04 (Journal Article)
URL: https://www.sciencedirect.com/science/article/pii/S0926580524005405
Reinforcement learning (RL) requires many interactions with the environment to converge to an optimal strategy, which makes it infeasible to apply to wheel loaders and the bucket-filling problem without simulators. However, the pile dynamics are difficult to model in a simulator because of unknown parameters, which results in poor transferability from simulation to the real environment. Instead, this paper uses world models, serving as fast surrogate simulators, to create a dream environment in which an RL agent explores and optimizes its bucket-filling behavior. The trained agent is then deployed on a full-size wheel loader without modification, demonstrating its ability to outperform the previous benchmark controller, which was synthesized using imitation learning. Additionally, it matched the performance of a controller that was pre-trained with imitation learning and then optimized on the test pile using RL.
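To illustrate the core idea, the sketch below shows a minimal "dream environment" loop: a learned world model predicts transitions and rewards, and a policy is improved entirely on these imagined rollouts, with no real-machine interaction during optimization. This is a hedged toy example, not the paper's method: the one-dimensional fill-level dynamics, the constant-effort candidate policies, and the naive random search all stand in for the learned world model and RL algorithm the paper actually uses.

```python
import random

class WorldModel:
    """Toy surrogate dynamics: state is the bucket fill level in [0, 1].

    Stands in for a learned world model; the real one would be trained
    from recorded machine data, not hand-written like this.
    """

    def step(self, state, action):
        # Imagined transition: action is a lift/tilt effort in [0, 1].
        next_state = min(1.0, state + 0.1 * action)
        reward = next_state - state   # reward = material gained this step
        done = next_state >= 1.0      # bucket full
        return next_state, reward, done

def dream_rollout(model, policy, horizon=20):
    """Roll a policy forward inside the world model only ('dreaming')."""
    state, total = 0.0, 0.0
    for _ in range(horizon):
        action = policy(state)
        state, reward, done = model.step(state, action)
        total += reward
        if done:
            break
    return total

def improve_policy(model, n_candidates=50, seed=0):
    """Naive policy search, scored purely on imagined rollouts."""
    rng = random.Random(seed)
    best_gain, best_policy = -1.0, None
    for _ in range(n_candidates):
        effort = rng.uniform(0.0, 1.0)
        policy = lambda s, e=effort: e   # constant-effort candidate policy
        gain = dream_rollout(model, policy)
        if gain > best_gain:
            best_gain, best_policy = gain, policy
    return best_policy, best_gain

model = WorldModel()
policy, gain = improve_policy(model)
print(f"best imagined bucket fill: {gain:.2f}")
```

The key property this toy preserves is that all optimization queries go to the cheap surrogate model; the real wheel loader is only touched once, at deployment time.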
About the journal:
Automation in Construction is an international journal publishing original research on the use of information technologies in the construction industry, covering design, engineering, construction technologies, and the maintenance and management of constructed facilities.
The journal's scope spans the entire construction life cycle: from initial planning and design, through construction of the facility and its operation and maintenance, to the eventual dismantling and recycling of buildings and engineering structures.