Backdozer: A Backdoor Detection Methodology for DRL-based Traffic Controllers

Yue Wang, Wenqing Li, Manaar Alam, M. Maniatakos, S. Jabari
DOI: 10.1145/3639828
Journal on Autonomous Transportation Systems, published 2024-01-08
Citations: 0

Abstract

While the advent of Deep Reinforcement Learning (DRL) has substantially improved the efficiency of Autonomous Vehicles (AVs), it makes them vulnerable to backdoor attacks that can potentially cause traffic congestion or even collisions. Backdoor functionality is typically implanted by poisoning training datasets with stealthy malicious data, designed to preserve high accuracy on legitimate inputs while inducing desired misclassifications for specific adversary-selected inputs. Existing countermeasures against backdoors predominantly concentrate on image classification, utilizing image-based properties, rendering these methods inapplicable to the regression tasks of DRL-based AV controllers that rely on continuous sensor data as inputs. In this paper, we introduce the first-ever defense against backdoors on regression tasks of DRL-based models, called Backdozer. Our method systematically extracts more abstract features from representations of training data by projecting them into a specific latent subspace and segregating them into several disjoint groups based on the distribution of legitimate outputs. The key observation of Backdozer is that authentic representations for each group reside in one latent subspace, whereas the incorporation of malicious data impacts that subspace. Backdozer optimizes a sample-wise weight vector for the representations, capturing the disparities in projections originating from different groups. We experimentally demonstrate that Backdozer can attain 100% accuracy in detecting backdoors. We also evaluate its effectiveness against three closely related state-of-the-art defenses.
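The abstract's core idea, that clean representations within each output group lie near a common latent subspace while poisoned samples deviate from it, can be illustrated with a minimal sketch. This is not the authors' Backdozer algorithm (which optimizes a sample-wise weight vector); it is a simplified stand-in, assuming hypothetical inputs `reps` (latent representations) and `outputs` (continuous regression targets), that bins samples by output quantile, fits each group's principal subspace via SVD, and flags samples with anomalously large projection residuals:

```python
import numpy as np

def flag_by_group_subspace(reps, outputs, n_groups=4, n_components=2, z_thresh=3.0):
    """Toy sketch of subspace-based poison detection (not the actual Backdozer
    method): group samples by output quantile, fit a linear subspace per group,
    and flag samples whose residual to that subspace is a z-score outlier."""
    # Assign each sample to a group by quantiles of its continuous output.
    edges = np.quantile(outputs, np.linspace(0, 1, n_groups + 1)[1:-1])
    groups = np.digitize(outputs, edges)
    flags = np.zeros(len(reps), dtype=bool)
    for g in range(n_groups):
        idx = np.where(groups == g)[0]
        if len(idx) <= n_components:
            continue  # too few samples to fit a subspace
        Xc = reps[idx] - reps[idx].mean(axis=0)
        # Top principal directions of this group's centered representations.
        _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
        basis = Vt[:n_components]
        # Norm of the residual after projecting onto the group's subspace.
        resid = Xc - (Xc @ basis.T) @ basis
        scores = np.linalg.norm(resid, axis=1)
        z = (scores - scores.mean()) / (scores.std() + 1e-12)
        flags[idx] = z > z_thresh
    return flags
```

Samples far from their group's subspace get flagged, which mirrors the paper's observation that malicious data perturbs the subspace occupied by authentic representations of a group.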