Safe and Interpretable Human-Like Planning With Transformer-Based Deep Inverse Reinforcement Learning for Autonomous Driving

IF 6.4 · CAS Zone 2, Computer Science · JCR Q1, Automation & Control Systems
Jiangfeng Nan;Ruzheng Zhang;Guodong Yin;Weichao Zhuang;Yilong Zhang;Weiwen Deng
DOI: 10.1109/TASE.2025.3539340
Journal: IEEE Transactions on Automation Science and Engineering, vol. 22, pp. 12134-12146
Publication date: 2025-02-05 (Journal Article)
Open access: No
URL: https://ieeexplore.ieee.org/document/10876190/
Citations: 0

Abstract

Human-like decision-making and planning are crucial for advancing the decision-making level of autonomous driving and increasing acceptance in the autonomous vehicle market, as well as for closing the data loop for autonomous driving. However, human-like decision-making and planning methods still face challenges in safety and interpretability, particularly in multi-vehicle interaction scenarios. In light of this, this paper proposes an interpretable human-like decision-making and planning method with Transformer-based deep inverse reinforcement learning. The proposed method employs a Transformer encoder to extract features from the scenario and determine the attention assigned by the ego vehicle to each traffic vehicle, thereby improving the interpretability of planning outcomes. Furthermore, for improved safety in planning, the model is trained on both positive and negative expert demonstrations. The experimental results show that the proposed method enhances model safety while maintaining imitation levels compared to conventional methods. Additionally, the attention allocation results closely align with those of human drivers, indicating the model’s ability to elucidate the importance of each traffic vehicle for decision-making and planning, thereby improving interpretability. Therefore, the proposed method not only ensures high levels of imitation and safety but also enhances interpretability by providing accurate attention allocation results for decision-making and planning. Note to Practitioners—This paper presents a method for enhancing the planning of autonomous vehicles by making it more interpretable and safer. Using Transformer-based deep inverse reinforcement learning, the approach improves clarity by showing how the vehicle prioritizes other traffic participants and learning from both positive and negative examples.
This not only enhances safety and decision accuracy but also provides insights into the vehicle’s reasoning process, which is crucial for debugging and increasing user trust. Future work could focus on adapting this method for even more complex driving scenarios.
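The two mechanisms the abstract describes, attention assigned by the ego vehicle to each traffic vehicle and reward learning from both positive and negative demonstrations, can be sketched minimally. The NumPy functions, feature dimensions, and margin-based ranking loss below are illustrative assumptions for intuition only, not the paper's actual architecture or training objective:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def vehicle_attention(ego_feat, vehicle_feats, Wq, Wk):
    """Single-head scaled dot-product attention of the ego vehicle
    over n traffic vehicles (hypothetical simplification).

    ego_feat: (d,) ego state feature; vehicle_feats: (n, d).
    Returns one weight per vehicle, interpretable as its importance.
    """
    q = Wq @ ego_feat                      # query from ego state
    K = vehicle_feats @ Wk.T               # keys from traffic vehicles
    scores = K @ q / np.sqrt(q.shape[0])   # scaled dot-product scores
    return softmax(scores)                 # weights sum to 1

def irl_loss(reward_pos, reward_neg, margin=1.0):
    # Illustrative margin loss: the learned reward should rank
    # positive (safe, human-like) demonstrations above negative
    # (unsafe) ones by at least `margin`.
    return np.maximum(0.0, margin - (reward_pos - reward_neg)).mean()

rng = np.random.default_rng(0)
d = 8                                       # assumed feature dimension
Wq = rng.standard_normal((d, d))
Wk = rng.standard_normal((d, d))
ego = rng.standard_normal(d)
vehicles = rng.standard_normal((5, d))      # 5 surrounding vehicles
w = vehicle_attention(ego, vehicles, Wq, Wk)
print(np.round(w, 3))
```

The attention vector `w` is what makes such a model inspectable: each entry can be read off as how much the planner weighted one surrounding vehicle, which is the interpretability claim the abstract makes.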
Source journal: IEEE Transactions on Automation Science and Engineering (Engineering & Technology, Automation & Control Systems)
CiteScore: 12.50
Self-citation rate: 14.30%
Annual publications: 404
Review time: 3.0 months
Journal introduction: The IEEE Transactions on Automation Science and Engineering (T-ASE) publishes fundamental papers on Automation, emphasizing scientific results that advance efficiency, quality, productivity, and reliability. T-ASE encourages interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, operations research, and other fields. T-ASE welcomes results relevant to industries such as agriculture, biotechnology, healthcare, home automation, maintenance, manufacturing, pharmaceuticals, retail, security, service, supply chains, and transportation. T-ASE addresses a research community willing to integrate knowledge across disciplines and industries. For this purpose, each paper includes a Note to Practitioners that summarizes how its results can be applied or how they might be extended to apply in practice.