Safety Margins for Reinforcement Learning

Alexander Grushin, Walt Woods, Alvaro Velasquez, Simon Khan

2023 IEEE Conference on Artificial Intelligence (CAI), June 2023
DOI: 10.1109/CAI54212.2023.00026
Abstract: Any autonomous controller will be unsafe in some situations. The ability to quantitatively identify when these unsafe situations are about to occur is crucial for drawing in timely human oversight in, e.g., freight transportation applications. In this work, we demonstrate that the true criticality of an agent’s situation can be robustly defined as the mean reduction in reward given some number of random actions. Proxy criticality metrics that are computable in real time (i.e., without actually simulating the effects of random actions) can be compared to the true criticality, and we show how to leverage these proxy metrics to generate safety margins, which directly tie the consequences of potentially incorrect actions to an anticipated loss in overall performance. We evaluate our approach on policies learned by APE-X and A3C within an Atari environment, and demonstrate how safety margins decrease as agents approach failure states. The integration of safety margins into programs for monitoring deployed agents allows for the real-time identification of potentially catastrophic situations.
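The abstract's definition of true criticality suggests a straightforward Monte Carlo estimate: from a given state, compare the expected return under the learned policy against the expected return when the first few actions are replaced with random ones. The sketch below is a minimal illustration of that idea, not the authors' implementation; the environment cloning via deepcopy, the classic Gym-style step/sample API, and the helper names (rollout_return, true_criticality, proxy_criticality) are all assumptions made for illustration.

```python
import copy

def rollout_return(env, policy, obs, horizon=1000, n_random=0):
    """Total reward from one rollout that takes `n_random` random actions
    first, then follows the learned policy. Assumes the environment can be
    cloned with deepcopy and exposes a classic Gym-style step/sample API."""
    env = copy.deepcopy(env)
    total = 0.0
    for t in range(horizon):
        if t < n_random:
            action = env.action_space.sample()  # perturbation action
        else:
            action = policy(obs)                # learned policy's action
        obs, reward, done, _ = env.step(action)
        total += reward
        if done:
            break
    return total

def true_criticality(env, policy, obs, n_random=1, n_rollouts=32):
    """Mean reduction in return caused by `n_random` initial random actions,
    estimated over `n_rollouts` Monte Carlo rollouts from the same state."""
    mean = lambda xs: sum(xs) / len(xs)
    base = mean([rollout_return(env, policy, obs)
                 for _ in range(n_rollouts)])
    perturbed = mean([rollout_return(env, policy, obs, n_random=n_random)
                      for _ in range(n_rollouts)])
    return base - perturbed

def proxy_criticality(q_values):
    """One plausible real-time proxy: the gap between the best and the
    average Q-value at the current state. (An assumption for illustration;
    the paper compares such proxies against true criticality rather than
    prescribing this particular one.)"""
    return max(q_values) - sum(q_values) / len(q_values)
```

A safety margin can then be read off by calibrating the cheap proxy against true criticality values collected offline: once the proxy indicates that even a small number of incorrect actions would cost more than an acceptable loss in return, a monitoring program can alert a human operator, which is the deployment scenario the abstract describes.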