{"title":"Byzantine-Resilient Distributed Bandit Online Optimization in Dynamic Environments","authors":"Mengli Wei;Wenwu Yu;Hongzhe Liu;Duxin Chen","doi":"10.1109/TICPS.2024.3410846","DOIUrl":null,"url":null,"abstract":"We consider the constrained multi-agent online optimization problem in dynamic environments that are vulnerable to Byzantine attacks, where some infiltrated agents may deviate from the prescribed update rule and send arbitrary messages. The objective functions are exposed in a bandit form, i.e., only the function value is revealed to each agent at the sampling instance, and held privately by each agent. The agents only exchange information with their neighbors to update decisions, and the collective goal is to minimize the sum of the unattacked agents' objective functions in dynamic environments, where the same function can only be sampled once. To handle this problem, a Byzantine-Resilient Distributed Bandit Online Convex Optimization (BR-DBOCO) algorithm that can tolerate up to \n<inline-formula><tex-math>$\\mathcal {B}$</tex-math></inline-formula>\n Byzantine agents is developed. Specifically, the BR-DBOCO employs the one-point bandit feedback (OPBF) mechanism and state filter to cope with the objective function, which cannot be explicitly expressed in dynamic environments and the arbitrary deviation states caused by Byzantine attacks, respectively. We show that sublinear expected regret is achieved if the accumulative deviation of the comparator sequence also grows sublinearly with a proper exploration parameter. 
Finally, experimental results are presented to illustrate the effectiveness of the proposed algorithm.","PeriodicalId":100640,"journal":{"name":"IEEE Transactions on Industrial Cyber-Physical Systems","volume":"2 ","pages":"154-165"},"PeriodicalIF":0.0000,"publicationDate":"2024-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Industrial Cyber-Physical Systems","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10551450/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
We consider constrained multi-agent online optimization in dynamic environments that are vulnerable to Byzantine attacks, where some infiltrated agents may deviate from the prescribed update rule and send arbitrary messages. The objective functions are revealed in bandit form: at each sampling instant, an agent observes only its own function value, which it holds privately. Agents exchange information only with their neighbors to update their decisions, and the collective goal is to minimize the sum of the unattacked agents' objective functions in a dynamic environment where each function can be sampled only once. To handle this problem, we develop a Byzantine-Resilient Distributed Bandit Online Convex Optimization (BR-DBOCO) algorithm that tolerates up to $\mathcal{B}$ Byzantine agents. Specifically, BR-DBOCO employs a one-point bandit feedback (OPBF) mechanism to cope with objective functions that cannot be expressed explicitly in dynamic environments, and a state filter to suppress the arbitrarily deviating states caused by Byzantine attacks. We show that sublinear expected regret is achieved, with a properly chosen exploration parameter, whenever the cumulative deviation of the comparator sequence also grows sublinearly. Finally, experimental results are presented to illustrate the effectiveness of the proposed algorithm.
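The one-point bandit feedback idea the abstract refers to can be illustrated with the standard spherical-smoothing estimator (in the style of Flaxman et al.): query the unknown function once at a randomly perturbed point and scale the observed value into a gradient estimate. This is a minimal sketch, not the paper's exact update; the function name, the Gaussian-normalization sampling of the sphere, and the parameter choices are assumptions for illustration.

```python
import numpy as np

def opbf_gradient(f, x, delta, rng):
    """One-point bandit feedback gradient estimate:
        g = (d / delta) * f(x + delta * u) * u,
    where u is drawn uniformly from the unit sphere. The estimate is
    unbiased for the gradient of the delta-smoothed version of f, and
    uses only a single function evaluation (one "bandit" sample)."""
    d = x.size
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)            # uniform direction on the unit sphere
    return (d / delta) * f(x + delta * u) * u
```

For a linear function the smoothed gradient equals the true gradient, so averaging many such single-query estimates recovers the exact slope; for general convex functions the bias shrinks with the exploration parameter `delta`, which is why the regret bound ties `delta` to the horizon.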
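A common way to build the kind of state filter the abstract mentions is a coordinate-wise trimmed mean: each agent sorts the states received from its neighbors and discards the $\mathcal{B}$ largest and $\mathcal{B}$ smallest values in every coordinate before averaging. The sketch below shows that generic construction only; the paper's actual filter may differ, and the function name and interface are assumptions.

```python
import numpy as np

def trimmed_mean(neighbor_states, B):
    """Coordinate-wise trimmed mean over a stack of neighbor states
    (shape: n_neighbors x d). In each coordinate, drop the B largest
    and B smallest values, then average the remainder. With at most B
    Byzantine neighbors, every surviving value in a coordinate is
    bracketed by honest values."""
    S = np.sort(np.asarray(neighbor_states, dtype=float), axis=0)
    n = S.shape[0]
    if n <= 2 * B:
        raise ValueError("need more than 2B neighbor states to trim safely")
    return S[B:n - B].mean(axis=0)
```

The trimming discards outliers independently per coordinate, so a Byzantine neighbor that sends an extreme value in one coordinate and an honest-looking value in another is filtered only where it deviates.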