Active Robust Adversarial Reinforcement Learning Under Temporally Coupled Perturbations
Jiacheng Yang, Yuanda Wang, Lu Dong, Lei Xue, Changyin Sun
IEEE Transactions on Artificial Intelligence, vol. 6, no. 4, pp. 874-884. Published 2024-11-15.
DOI: 10.1109/TAI.2024.3499938
https://ieeexplore.ieee.org/document/10754649/
Citation count: 0
Abstract
Robust reinforcement learning (RL) aims to improve the generalization of agents under model mismatch. As a major branch of robust RL, adversarial approaches formulate the problem as a zero-sum game in which adversaries seek to apply worst-case perturbations to the dynamics. However, the potential constraints on adversarial perturbations are seldom addressed in existing approaches. In this article, we consider temporally coupled settings, where adversarial perturbations change continuously at a bounded rate. This kind of constraint commonly arises in a variety of real-world situations (e.g., changes in wind speed and ocean currents). We propose a novel robust RL approach, named active robust adversarial RL (ARA-RL), that tackles this problem in an adversarial architecture. First, we introduce a type of RL adversary that generates temporally coupled perturbations on agent actions. Then, we embed a diagnostic module in the RL agent, enabling it to actively detect temporally coupled perturbations in unseen environments. Through adversarial training, the agent seeks to maximize its worst-case performance and thus achieve robustness under perturbations. Finally, extensive experiments demonstrate that our proposed approach provides significant robustness against temporally coupled perturbations and outperforms other baselines on several continuous control tasks.
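To make the notion of a temporally coupled perturbation concrete, the sketch below generates a perturbation sequence whose per-step change is bounded. It is an illustration only, not the ARA-RL adversary: the abstract does not give the exact constraint, so the per-component bounds `max_mag` and `max_rate`, the function names, and the random "proposal" standing in for a learned adversary policy are all assumptions.

```python
import random


def clip(x, lo, hi):
    """Clamp a scalar to the interval [lo, hi]."""
    return max(lo, min(hi, x))


def temporally_coupled_step(prev, proposal, max_mag, max_rate):
    """Map a proposed perturbation to one that respects both bounds.

    Assumed constraint (not taken from the paper): each component may
    change by at most `max_rate` per step and must stay in
    [-max_mag, max_mag].
    """
    # Bounded rate: limit the change relative to the previous perturbation.
    limited = [p + clip(q - p, -max_rate, max_rate) for p, q in zip(prev, proposal)]
    # Bounded magnitude: clamping toward an interval that already contains
    # `prev` cannot increase the distance to `prev`, so the rate bound holds.
    return [clip(v, -max_mag, max_mag) for v in limited]


def rollout_perturbations(steps, dim, max_mag, max_rate, seed=0):
    """Generate a perturbation sequence from random proposals.

    The random proposal is a hypothetical stand-in for an adversary policy.
    """
    rng = random.Random(seed)
    delta = [0.0] * dim
    history = []
    for _ in range(steps):
        proposal = [rng.uniform(-max_mag, max_mag) for _ in range(dim)]
        delta = temporally_coupled_step(delta, proposal, max_mag, max_rate)
        history.append(delta)
    return history


def perturb_action(action, delta):
    """Apply the perturbation additively to the agent's action (an assumption)."""
    return [a + d for a, d in zip(action, delta)]
```

With `max_rate` much smaller than `max_mag`, the sequence drifts slowly, much like a gradually shifting wind, instead of jumping between extremes at every step, which is the distinction the abstract draws from unconstrained adversarial perturbations.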