PE-RLHF: Reinforcement Learning with Human Feedback and physics knowledge for safe and trustworthy autonomous driving

IF 7.6 | Zone 1 (Engineering & Technology) | Q1 TRANSPORTATION SCIENCE & TECHNOLOGY
Zilin Huang, Zihao Sheng, Sikai Chen
{"title":"PE-RLHF: Reinforcement Learning with Human Feedback and physics knowledge for safe and trustworthy autonomous driving","authors":"Zilin Huang,&nbsp;Zihao Sheng,&nbsp;Sikai Chen","doi":"10.1016/j.trc.2025.105262","DOIUrl":null,"url":null,"abstract":"<div><div>In the field of autonomous driving, developing safe and trustworthy autonomous driving policies remains a significant challenge. Recently, Reinforcement Learning with Human Feedback (RLHF) has attracted substantial attention due to its potential to enhance training safety and sampling efficiency. Nevertheless, existing RLHF-enabled methods in the autonomous driving domain often falter when faced with imperfect human demonstrations, potentially leading to training oscillations or even worse performance than rule-based approaches. Inspired by the human learning process, we propose <strong>Physics-enhanced Reinforcement Learning with Human Feedback (PE-RLHF)</strong>. This novel framework synergistically integrates human feedback (e.g., human intervention) and physics knowledge (e.g., traffic flow model) into the training loop of reinforcement learning. The key advantage of PE-RLHF is that the learned policy will perform at least as well as the given physics-based policy, even when human feedback quality deteriorates, thus ensuring trustworthy safety improvements. PE-RLHF introduces a Physics-enhanced Human-AI (PE-HAI) collaborative paradigm for dynamic action selection between human and physics-based actions, employs a reward-free approach with a proxy value function to capture human preferences, and incorporates a minimal intervention mechanism to reduce the cognitive load on human mentors. Extensive experiments across diverse driving scenarios demonstrate that PE-RLHF significantly outperforms traditional methods, achieving state-of-the-art (SOTA) performance in safety, efficiency, and generalizability, even with varying quality of human feedback. The philosophy behind PE-RLHF not only advances autonomous driving technology but can also offer valuable insights for other safety-critical domains. Demo video and code are available at: <span><span>https://zilin-huang.github.io/PE-RLHF-website/</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":54417,"journal":{"name":"Transportation Research Part C-Emerging Technologies","volume":"179 ","pages":"Article 105262"},"PeriodicalIF":7.6000,"publicationDate":"2025-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Transportation Research Part C-Emerging Technologies","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0968090X25002669","RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"TRANSPORTATION SCIENCE & TECHNOLOGY","Score":null,"Total":0}
Citations: 0

Abstract

In the field of autonomous driving, developing safe and trustworthy autonomous driving policies remains a significant challenge. Recently, Reinforcement Learning with Human Feedback (RLHF) has attracted substantial attention due to its potential to enhance training safety and sampling efficiency. Nevertheless, existing RLHF-enabled methods in the autonomous driving domain often falter when faced with imperfect human demonstrations, potentially leading to training oscillations or even worse performance than rule-based approaches. Inspired by the human learning process, we propose Physics-enhanced Reinforcement Learning with Human Feedback (PE-RLHF). This novel framework synergistically integrates human feedback (e.g., human intervention) and physics knowledge (e.g., traffic flow model) into the training loop of reinforcement learning. The key advantage of PE-RLHF is that the learned policy will perform at least as well as the given physics-based policy, even when human feedback quality deteriorates, thus ensuring trustworthy safety improvements. PE-RLHF introduces a Physics-enhanced Human-AI (PE-HAI) collaborative paradigm for dynamic action selection between human and physics-based actions, employs a reward-free approach with a proxy value function to capture human preferences, and incorporates a minimal intervention mechanism to reduce the cognitive load on human mentors. Extensive experiments across diverse driving scenarios demonstrate that PE-RLHF significantly outperforms traditional methods, achieving state-of-the-art (SOTA) performance in safety, efficiency, and generalizability, even with varying quality of human feedback. The philosophy behind PE-RLHF not only advances autonomous driving technology but can also offer valuable insights for other safety-critical domains. Demo video and code are available at: https://zilin-huang.github.io/PE-RLHF-website/.
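As a rough illustration of the PE-HAI collaborative paradigm described in the abstract, the sketch below shows one way the arbitration between agent, human, and physics-based actions could be wired up around a proxy value function: the human action is adopted only when the proxy value judges it at least as good as the physics-based fallback, so degraded human feedback cannot pull performance below the physics-based policy. All names here (PEHAISelector, physics_policy, proxy_value) are hypothetical and not taken from the paper; the authors' actual implementation is linked from the project page above.

```python
# Minimal, hypothetical sketch of PE-HAI-style action selection (not the authors' code).
from dataclasses import dataclass
from typing import Callable, Optional

import numpy as np


@dataclass
class PEHAISelector:
    """Arbitrates among agent, human, and physics-based actions at each step."""
    physics_policy: Callable[[np.ndarray], np.ndarray]      # e.g., a traffic-flow model acting as fallback
    proxy_value: Callable[[np.ndarray, np.ndarray], float]  # learned proxy value of a (state, action) pair

    def select(self,
               state: np.ndarray,
               agent_action: np.ndarray,
               human_action: Optional[np.ndarray]) -> np.ndarray:
        # No intervention: let the agent act, keeping the human's cognitive load minimal.
        if human_action is None:
            return agent_action

        physics_action = self.physics_policy(state)
        # Human intervened: keep the human action only if the proxy value rates it
        # at least as highly as the physics-based action; otherwise fall back, so the
        # executed behavior never drops below the physics-based policy's level.
        if self.proxy_value(state, human_action) >= self.proxy_value(state, physics_action):
            return human_action
        return physics_action
```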
Source journal: Transportation Research Part C: Emerging Technologies

CiteScore: 15.80
Self-citation rate: 12.00%
Articles per year: 332
Review time: 64 days

Journal description: Transportation Research: Part C (TR_C) is dedicated to showcasing high-quality, scholarly research that delves into the development, applications, and implications of transportation systems and emerging technologies. Our focus lies not solely on individual technologies, but rather on their broader implications for the planning, design, operation, control, maintenance, and rehabilitation of transportation systems, services, and components. In essence, the intellectual core of the journal revolves around the transportation aspect rather than the technology itself. We actively encourage the integration of quantitative methods from diverse fields such as operations research, control systems, complex networks, computer science, and artificial intelligence. Join us in exploring the intersection of transportation systems and emerging technologies to drive innovation and progress in the field.