Adversarial Deep Learning for Online Resource Allocation

Impact factor: 0.7 | JCR: Q4 (Computer Science, Information Systems)
Bingqian Du, Zhiyi Huang, Chuan Wu
Journal: ACM Transactions on Modeling and Performance Evaluation of Computing Systems
Volume: 6 1
Pages: 1 - 25
Publication date: 2021-11-19
Publication type: Journal Article
DOI: 10.1145/3494526 (https://doi.org/10.1145/3494526)
Citations: 4

Abstract

Online algorithms are an important branch of algorithm design. Designing an online algorithm with a bounded competitive ratio (a worst-case performance guarantee) can be hard and usually relies on problem-specific assumptions. Inspired by adversarial training in generative adversarial networks (GANs) and by the fact that the competitive ratio of an online algorithm is defined on worst-case input, we adopt deep neural networks (NNs) to learn an online algorithm for a resource allocation and pricing problem from scratch, with the goal of minimizing the performance gap between the offline optimum and the learned online algorithm on worst-case input. Specifically, we use two NNs as the algorithm and the adversary, respectively, and let them play a zero-sum game: the adversary generates worst-case input, while the algorithm learns the best strategy against the input the adversary provides. To ensure better convergence of the algorithm network (to the desired online algorithm), we propose a novel per-round update method for sequential decision making that breaks the complex dependency among rounds, so that updates can be made for every possible action instead of only the sampled actions. To the best of our knowledge, our work is the first to use deep NNs to design an online algorithm from the perspective of a worst-case performance guarantee. Empirical studies show that our updating methods ensure convergence to a Nash equilibrium and that the learned algorithm outperforms state-of-the-art online algorithms under various settings.
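The zero-sum game between an algorithm and an adversary can be illustrated on a toy scale. The sketch below is not the authors' method and uses no neural networks: it assumes a hypothetical single-item posted-price problem, a small hand-picked set of prices and bid sequences, and a standard multiplicative-weights (Hedge) update for both players in place of the paper's per-round update. All function names and numbers are illustrative assumptions. Each payoff is the ratio of online revenue to the offline optimum, which the algorithm tries to maximize and the adversary tries to minimize.

```python
import math


def online_revenue(price, bids):
    """Posted-price rule: sell one unit to the first bid that meets the price."""
    for bid in bids:
        if bid >= price:
            return price
    return 0.0


def payoff_matrix(prices, inputs):
    """Entry [i][j] is online revenue / offline optimum for price i on input j."""
    return [[online_revenue(p, bids) / max(bids) for bids in inputs]
            for p in prices]


def softmax(log_weights):
    """Turn log-weights into a probability vector (shifted for stability)."""
    top = max(log_weights)
    exps = [math.exp(w - top) for w in log_weights]
    total = sum(exps)
    return [e / total for e in exps]


def hedge_self_play(matrix, rounds=5000, eta=0.05):
    """Both players run multiplicative weights; return their average strategies."""
    n, m = len(matrix), len(matrix[0])
    log_alg, log_adv = [0.0] * n, [0.0] * m
    avg_alg, avg_adv = [0.0] * n, [0.0] * m
    for _ in range(rounds):
        x, y = softmax(log_alg), softmax(log_adv)
        # Expected ratio of every action against the opponent's current mix.
        alg_gain = [sum(matrix[i][j] * y[j] for j in range(m)) for i in range(n)]
        adv_loss = [sum(matrix[i][j] * x[i] for i in range(n)) for j in range(m)]
        for i in range(n):
            log_alg[i] += eta * alg_gain[i]   # algorithm maximizes the ratio
            avg_alg[i] += x[i] / rounds
        for j in range(m):
            log_adv[j] -= eta * adv_loss[j]   # adversary minimizes the ratio
            avg_adv[j] += y[j] / rounds
    return avg_alg, avg_adv


def worst_case_ratio(matrix, strategy):
    """Smallest expected ratio the mixed strategy earns over all inputs."""
    n, m = len(matrix), len(matrix[0])
    return min(sum(strategy[i] * matrix[i][j] for i in range(n))
               for j in range(m))


if __name__ == "__main__":
    prices = [1.0, 2.0, 4.0]          # hypothetical candidate prices
    inputs = [[1.0], [2.0], [4.0]]    # hypothetical bid sequences
    game = payoff_matrix(prices, inputs)
    alg_mix, adv_mix = hedge_self_play(game)
    print("algorithm mix:", [round(p, 3) for p in alg_mix])
    print("adversary mix:", [round(p, 3) for p in adv_mix])
    print("worst-case ratio:", round(worst_case_ratio(game, alg_mix), 3))
```

For this toy matrix the equilibrium value works out to 0.5, so the printed worst-case ratio should sit close to, and never above, that number. Every pure price on its own guarantees at most 0.25, which shows in miniature why the learner is trained against a worst-case input generator and not against a fixed input distribution.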
Source journal metrics: CiteScore 2.10; self-citation rate 0.00%; articles published: 9