Dynamic Regret of Quantized Distributed Online Bandit Optimization in Zero-Sum Games
Lan Liao, Daniel W C Ho, Deming Yuan, Zhan Yu, Baoyong Zhang, Shengyuan Xu
IEEE Transactions on Cybernetics · Published 2025-09-16 · DOI: 10.1109/tcyb.2025.3604774
Abstract
This article investigates the distributed online optimization problem in a zero-sum game between two distinct time-varying multiagent networks. At each iteration, the agents not only communicate with their neighbors but also gather information about agents in the opposing network over a time-varying communication graph, assigning weights accordingly. Moreover, we consider quantized communication and bandit feedback mechanisms: agents transmit quantized information and adopt one-point estimators. In each round, agents make and submit decisions and then observe only cost function values near their decision points rather than the full cost functions. To guarantee the payoff of each network, we design an algorithm named quantized distributed online bandit optimization in two-network (QDOBO-TN). We use dynamic Nash equilibrium regret to measure the positive payoff discrepancy between the decision sequence produced by Algorithm QDOBO-TN and the Nash equilibrium sequence. Furthermore, we propose a multiepoch version of Algorithm QDOBO-TN. The regret bounds for both algorithms are sublinear in the iteration count T. Finally, we conduct a series of simulation experiments that further validate the effectiveness of the algorithms.
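The abstract names two mechanisms whose precise forms appear only in the full paper: a one-point (single-query) bandit gradient estimator and a quantizer applied to transmitted messages. The sketch below is a generic illustration of these two building blocks in isolation, not the paper's QDOBO-TN algorithm; the cost function f, smoothing radius delta, step size, and quantizer resolution are all illustrative assumptions.

```python
import numpy as np

def one_point_gradient_estimate(f, x, delta, rng):
    """Standard one-point bandit estimator (Flaxman et al., 2005):
    query f at the single perturbed point x + delta*u and return
    (d/delta) * f(x + delta*u) * u, an unbiased gradient estimate
    of the delta-smoothed surrogate of f."""
    d = x.shape[0]
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)          # uniform direction on the unit sphere
    return (d / delta) * f(x + delta * u) * u

def uniform_quantize(v, step):
    """Mid-tread uniform quantizer applied elementwise: neighbors
    receive only this finite-precision version of the iterate."""
    return step * np.round(v / step)

# Toy usage: one gradient step from bandit feedback, followed by
# quantization of the state that would be transmitted to neighbors.
rng = np.random.default_rng(0)
f = lambda x: 0.5 * np.sum(x**2)    # stand-in cost; agents see only values
x = np.ones(5)
g_hat = one_point_gradient_estimate(f, x, delta=0.1, rng=rng)
x_next = x - 0.05 * g_hat           # descent step with step size 0.05
x_sent = uniform_quantize(x_next, step=0.01)  # what neighbors receive
```

The estimator queries the cost at a single point near the current decision, matching the abstract's statement that agents receive only cost values near their decision points; the quantizer reflects the requirement that agents transmit quantized information rather than exact states.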
Journal Overview:
The scope of the IEEE Transactions on Cybernetics includes computational approaches to the field of cybernetics. Specifically, the Transactions welcomes papers on communication and control across machines, or across machines, humans, and organizations. The scope includes such areas as computational intelligence, computer vision, neural networks, genetic algorithms, machine learning, fuzzy systems, cognitive systems, decision making, and robotics, to the extent that they contribute to the theme of cybernetics or demonstrate an application of cybernetics principles.