Lane-changing policy offline reinforcement learning of autonomous vehicles based on BEAR algorithm with support set constraints
Authors: Caixia Huang, Yuxiang Wang, Zhiyong Zhang, Wenming Feng, Dayang Huang
DOI: 10.1177/09544070241265752 (https://doi.org/10.1177/09544070241265752)
Publication date: 2024-08-03
Publication type: Journal Article
Abstract
Imitation learning struggles to learn an optimal policy from datasets that mix expert and non-expert samples, because it cannot distinguish the quality of those samples. Standard online reinforcement learning (RL) methods, in turn, incur significant exploration costs and safety risks when interacting with the environment. To address these challenges, this study develops a lane-changing model for autonomous vehicles using the bootstrapping error accumulation reduction (BEAR) algorithm. The study first examines the distributional shift between the behavioral and target policies in offline RL, then applies the BEAR algorithm with support set constraints to mitigate it. On this basis, it proposes a lane-changing policy learning method for offline RL, which involves designing the state space, action set, and reward function. The reward function is tailored to guide the autonomous vehicle through lane changes while balancing safety, ride comfort, and traffic efficiency. Finally, the lane-changing policy is learned from a dataset containing both expert and non-expert samples. Test results indicate that the resulting policy achieves higher success rates and safety levels than policies derived via imitation learning.
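To make the support set constraint more concrete, the sketch below estimates the squared maximum mean discrepancy (MMD) between actions drawn from the dataset and actions proposed by the learned policy, which is the kind of quantity BEAR keeps small so the policy stays within the support of the data. This is a minimal illustration and not the authors' implementation: the Gaussian kernel, the bandwidth `sigma`, the continuous two-dimensional action vectors, and the function names are all assumptions, and the paper's own action set may be defined differently.

```python
import numpy as np


def gaussian_kernel(x, y, sigma=1.0):
    """Pairwise Gaussian kernel between rows of x (n, d) and y (m, d)."""
    # Broadcasting builds an (n, m) matrix of squared Euclidean distances.
    sq_dist = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd_squared(data_actions, policy_actions, sigma=1.0):
    """Biased sample estimate of squared MMD between two action sets.

    Assumption: actions are continuous vectors; sigma is a hypothetical
    bandwidth chosen for illustration only.
    """
    k_dd = gaussian_kernel(data_actions, data_actions, sigma).mean()
    k_dp = gaussian_kernel(data_actions, policy_actions, sigma).mean()
    k_pp = gaussian_kernel(policy_actions, policy_actions, sigma).mean()
    return k_dd - 2.0 * k_dp + k_pp


# Synthetic example: dataset actions cluster near the origin.
rng = np.random.default_rng(0)
data_actions = rng.normal(loc=0.0, scale=0.1, size=(64, 2))
near_policy = rng.normal(loc=0.1, scale=0.1, size=(64, 2))  # inside the data support
far_policy = rng.normal(loc=3.0, scale=0.1, size=(64, 2))   # far outside the support

mmd_near = mmd_squared(data_actions, near_policy)
mmd_far = mmd_squared(data_actions, far_policy)
print(f"near: {mmd_near:.4f}, far: {mmd_far:.4f}")
```

In a BEAR-style training loop, a value like this would typically be held below a threshold through a Lagrange multiplier while the policy maximizes its estimated Q-value, so that actions far from the dataset (the `far_policy` case) are penalized.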
About the journal:
Accounts of Chemical Research presents short, concise and critical articles offering easy-to-read overviews of basic research and applications in all areas of chemistry and biochemistry. These short reviews focus on research from the author’s own laboratory and are designed to teach the reader about a research project. In addition, Accounts of Chemical Research publishes commentaries that give an informed opinion on a current research problem. Special Issues online are devoted to a single topic of unusual activity and significance.
Accounts of Chemical Research replaces the traditional article abstract with an article "Conspectus." These entries synopsize the research, affording the reader a closer look at the content and significance of an article. By describing the article contents in more detail, the Conspectus makes the article easier for search engines to discover and increases the exposure of the research.