Byzantine-Resilient Distributed Bandit Online Optimization in Dynamic Environments

Mengli Wei;Wenwu Yu;Hongzhe Liu;Duxin Chen
DOI: 10.1109/TICPS.2024.3410846
Journal: IEEE Transactions on Industrial Cyber-Physical Systems, vol. 2, pp. 154-165
Published: 2024-06-06
Citations: 0

Abstract

We consider the constrained multi-agent online optimization problem in dynamic environments that are vulnerable to Byzantine attacks, where some infiltrated agents may deviate from the prescribed update rule and send arbitrary messages. The objective functions are exposed in a bandit form, i.e., only the function value at the sampled point is revealed to each agent, and it is held privately by that agent. The agents exchange information only with their neighbors to update decisions, and the collective goal is to minimize the sum of the unattacked agents' objective functions in dynamic environments, where the same function can be sampled only once. To handle this problem, a Byzantine-Resilient Distributed Bandit Online Convex Optimization (BR-DBOCO) algorithm that can tolerate up to $\mathcal {B}$ Byzantine agents is developed. Specifically, BR-DBOCO employs a one-point bandit feedback (OPBF) mechanism to cope with objective functions that cannot be explicitly expressed in dynamic environments, and a state filter to cope with the arbitrary deviation states caused by Byzantine attacks. We show that sublinear expected regret is achieved if the accumulative deviation of the comparator sequence also grows sublinearly, given a proper exploration parameter. Finally, experimental results are presented to illustrate the effectiveness of the proposed algorithm.
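The OPBF mechanism rests on one-point gradient estimation: from a single function evaluation at a randomly perturbed point, an agent forms a stochastic surrogate for the gradient it cannot observe. Below is a minimal sketch of the standard one-point estimator (in the style of Flaxman et al.'s bandit gradient descent), not the paper's exact update; the quadratic loss, the ball constraint, and all parameter values are illustrative assumptions.

```python
import numpy as np

def opbf_gradient_estimate(f, x, delta, rng):
    """One-point bandit feedback (OPBF) gradient estimate.

    With a single evaluation of f at a randomly perturbed point,
    (d / delta) * f(x + delta * u) * u estimates the gradient of a
    smoothed version of f, where u is uniform on the unit sphere.
    """
    d = x.shape[0]
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)          # uniform direction on the unit sphere
    return (d / delta) * f(x + delta * u) * u

def project_ball(x, radius):
    """Euclidean projection onto a ball of the given radius (a simple
    stand-in for the constraint set)."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

# Usage: projected online gradient descent driven only by one-point
# function evaluations of an (assumed) fixed quadratic loss.
rng = np.random.default_rng(0)
target = np.array([1.0, -2.0])
f = lambda z: np.sum((z - target) ** 2)

x = np.zeros(2)
eta, delta = 0.05, 0.1
for t in range(2000):
    g = opbf_gradient_estimate(f, x, delta, rng)
    x = project_ball(x - eta * g, radius=5.0)
```

The estimator trades accuracy for feasibility: it is usable when only a single function value per round is revealed, at the cost of variance that scales with d/delta, which is why the exploration parameter delta must be tuned to obtain sublinear regret.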
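A common way to realize a Byzantine-resilient state filter is coordinate-wise trimmed-mean aggregation of neighbor states. The sketch below is a generic rule of this kind under the assumption of at most `b` Byzantine neighbors, not necessarily the exact filter used by BR-DBOCO.

```python
import numpy as np

def trimmed_mean_filter(own_state, neighbor_states, b):
    """Coordinate-wise trimmed mean: for each coordinate, discard the b
    largest and b smallest neighbor values before averaging.

    With at most b Byzantine neighbors, every value that survives the
    trim lies between two honest values, so arbitrarily corrupted
    states cannot drag the aggregate outside the honest range.
    """
    states = np.asarray(neighbor_states)       # shape (n_neighbors, d)
    if states.shape[0] <= 2 * b:
        return own_state                       # too few neighbors to filter
    s = np.sort(states, axis=0)                # sort each coordinate column
    trimmed = s[b : states.shape[0] - b]
    return trimmed.mean(axis=0)

# Usage: three honest neighbors near [1, 1] and one Byzantine neighbor
# sending an arbitrary state; with b = 1 the outlier is discarded.
neighbors = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1e6, -1e6]]
filtered = trimmed_mean_filter(np.zeros(2), neighbors, b=1)
```

Note that the trim is applied per coordinate, not per neighbor: a Byzantine agent that is extreme in one coordinate but moderate in another is clipped only where it is extreme, which is what makes the bound of `b` tolerated attackers per neighborhood work.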