Cooperate or Not Cooperate: Transfer Learning With Multi-Armed Bandit for Spatial Reuse in Wi-Fi

Pedro Enrique Iturria-Rivera; Marcel Chenier; Bernard Herscovici; Burak Kantarci; Melike Erol-Kantarci
{"title":"合作与否:针对 Wi-Fi 中空间重用的多臂匪徒转移学习","authors":"Pedro Enrique Iturria-Rivera;Marcel Chenier;Bernard Herscovici;Burak Kantarci;Melike Erol-Kantarci","doi":"10.1109/TMLCN.2024.3371929","DOIUrl":null,"url":null,"abstract":"The exponential increase in the demand for high-performance services such as streaming video and gaming by wireless devices has posed several challenges for Wireless Local Area Networks (WLANs). In the context of Wi-Fi, the newest standards, IEEE 802.11ax, and 802.11be, bring high data rates in dense user deployments. Additionally, they introduce new flexible features in the physical layer, such as dynamic Clear-Channel-Assessment (CCA) thresholds, to improve spatial reuse (SR) in response to radio spectrum scarcity in dense scenarios. In this paper, we formulate the Transmission Power (TP) and CCA configuration problem with the objective of maximizing fairness and minimizing station starvation. We present five main contributions to distributed SR optimization using Multi-Agent Multi-Armed Bandits (MA-MABs). First, we provide regret analysis for the distributed Multi-Agent Contextual MABs (MA-CMABs) proposed in this work. Second, we propose reducing the action space given the large cardinality of action combinations of TP and CCA threshold values per Access Point (AP). Third, we present two deep MA-CMAB algorithms, named Sample Average Uncertainty (SAU)-Coop and SAU-NonCoop, as cooperative and non-cooperative versions to improve SR. Additionally, we analyze the viability of using MA-MABs solutions based on the \n<inline-formula> <tex-math>$\\epsilon $ </tex-math></inline-formula>\n-greedy, Upper Bound Confidence (UCB), and Thompson (TS) techniques. Finally, we propose a deep reinforcement transfer learning technique to improve adaptability in dynamic environments. Simulation results show that cooperation via the SAU-Coop algorithm leads to a 14.7% improvement in cumulative throughput and a 32.5% reduction in Packet Loss Rate (PLR) in comparison to non-cooperative approaches. Under dynamic scenarios, transfer learning mitigates service drops for at least 60% of the total users.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"2 ","pages":"351-369"},"PeriodicalIF":0.0000,"publicationDate":"2024-02-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10453622","citationCount":"0","resultStr":"{\"title\":\"Cooperate or Not Cooperate: Transfer Learning With Multi-Armed Bandit for Spatial Reuse in Wi-Fi\",\"authors\":\"Pedro Enrique Iturria-Rivera;Marcel Chenier;Bernard Herscovici;Burak Kantarci;Melike Erol-Kantarci\",\"doi\":\"10.1109/TMLCN.2024.3371929\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The exponential increase in the demand for high-performance services such as streaming video and gaming by wireless devices has posed several challenges for Wireless Local Area Networks (WLANs). In the context of Wi-Fi, the newest standards, IEEE 802.11ax, and 802.11be, bring high data rates in dense user deployments. Additionally, they introduce new flexible features in the physical layer, such as dynamic Clear-Channel-Assessment (CCA) thresholds, to improve spatial reuse (SR) in response to radio spectrum scarcity in dense scenarios. In this paper, we formulate the Transmission Power (TP) and CCA configuration problem with the objective of maximizing fairness and minimizing station starvation. 
We present five main contributions to distributed SR optimization using Multi-Agent Multi-Armed Bandits (MA-MABs). First, we provide regret analysis for the distributed Multi-Agent Contextual MABs (MA-CMABs) proposed in this work. Second, we propose reducing the action space given the large cardinality of action combinations of TP and CCA threshold values per Access Point (AP). Third, we present two deep MA-CMAB algorithms, named Sample Average Uncertainty (SAU)-Coop and SAU-NonCoop, as cooperative and non-cooperative versions to improve SR. Additionally, we analyze the viability of using MA-MABs solutions based on the \\n<inline-formula> <tex-math>$\\\\epsilon $ </tex-math></inline-formula>\\n-greedy, Upper Bound Confidence (UCB), and Thompson (TS) techniques. Finally, we propose a deep reinforcement transfer learning technique to improve adaptability in dynamic environments. Simulation results show that cooperation via the SAU-Coop algorithm leads to a 14.7% improvement in cumulative throughput and a 32.5% reduction in Packet Loss Rate (PLR) in comparison to non-cooperative approaches. Under dynamic scenarios, transfer learning mitigates service drops for at least 60% of the total users.\",\"PeriodicalId\":100641,\"journal\":{\"name\":\"IEEE Transactions on Machine Learning in Communications and Networking\",\"volume\":\"2 \",\"pages\":\"351-369\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-02-29\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10453622\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Machine Learning in Communications and Networking\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10453622/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Machine Learning in Communications and Networking","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10453622/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
The exponential increase in the demand for high-performance services such as streaming video and gaming by wireless devices has posed several challenges for Wireless Local Area Networks (WLANs). In the context of Wi-Fi, the newest standards, IEEE 802.11ax and 802.11be, bring high data rates in dense user deployments. Additionally, they introduce new flexible features in the physical layer, such as dynamic Clear Channel Assessment (CCA) thresholds, to improve spatial reuse (SR) in response to radio spectrum scarcity in dense scenarios. In this paper, we formulate the Transmission Power (TP) and CCA configuration problem with the objective of maximizing fairness and minimizing station starvation. We present five main contributions to distributed SR optimization using Multi-Agent Multi-Armed Bandits (MA-MABs). First, we provide regret analysis for the distributed Multi-Agent Contextual MABs (MA-CMABs) proposed in this work. Second, we propose reducing the action space, given the large cardinality of action combinations of TP and CCA threshold values per Access Point (AP). Third, we present two deep MA-CMAB algorithms, named Sample Average Uncertainty (SAU)-Coop and SAU-NonCoop, as cooperative and non-cooperative versions to improve SR. Additionally, we analyze the viability of MA-MAB solutions based on the $\epsilon$-greedy, Upper Confidence Bound (UCB), and Thompson Sampling (TS) techniques. Finally, we propose a deep reinforcement transfer learning technique to improve adaptability in dynamic environments. Simulation results show that cooperation via the SAU-Coop algorithm leads to a 14.7% improvement in cumulative throughput and a 32.5% reduction in Packet Loss Rate (PLR) compared to non-cooperative approaches. Under dynamic scenarios, transfer learning mitigates service drops for at least 60% of the total users.
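To make the bandit formulation in the abstract concrete, the sketch below shows one of the baseline techniques the paper evaluates: an independent $\epsilon$-greedy bandit per AP selecting from a discretized joint (TP, CCA threshold) action space, with Jain's fairness index standing in for the fairness/starvation reward. This is a minimal illustration, not the authors' implementation; the power and threshold grids, class names, and reward choice are assumptions made only for this example.

```python
import itertools
import numpy as np

# Hypothetical discretized configuration grids; the actual values used in the
# paper are not specified here.
TX_POWER_DBM = [5, 10, 15, 20]            # candidate Transmission Power levels
CCA_THRESHOLD_DBM = [-82, -77, -72, -67]  # candidate CCA thresholds

# Joint action space: every (TP, CCA) combination an AP could pick.
ACTIONS = list(itertools.product(TX_POWER_DBM, CCA_THRESHOLD_DBM))


class EpsilonGreedyAP:
    """Independent epsilon-greedy bandit run by a single AP."""

    def __init__(self, n_actions, epsilon=0.1, rng=None):
        self.epsilon = epsilon
        self.rng = rng or np.random.default_rng()
        self.counts = np.zeros(n_actions)  # times each action was played
        self.values = np.zeros(n_actions)  # running mean reward per action

    def select(self):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return int(self.rng.integers(len(self.values)))
        return int(np.argmax(self.values))

    def update(self, action, reward):
        # Incremental sample-average update of the chosen arm's value estimate.
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]


def fairness_reward(throughputs):
    """Illustrative reward: Jain's fairness index over per-station throughputs,
    standing in for the fairness/starvation objective described in the abstract."""
    t = np.asarray(throughputs, dtype=float)
    return (t.sum() ** 2) / (len(t) * (t ** 2).sum() + 1e-9)


# Sketch of the per-AP decision loop (the Wi-Fi environment itself is omitted):
# agents = [EpsilonGreedyAP(len(ACTIONS)) for _ in range(num_aps)]
# for step in range(horizon):
#     choices = [agent.select() for agent in agents]
#     # apply ACTIONS[c] = (tx_power_dbm, cca_threshold_dbm) at each AP,
#     # observe per-station throughputs, then:
#     # reward = fairness_reward(observed_station_throughputs)
#     # agent.update(choice, reward)
```

The size of ACTIONS grows as |TP levels| x |CCA thresholds| per AP, which is the large action-space cardinality the paper's second contribution addresses; the deep contextual SAU-Coop/SAU-NonCoop agents and the transfer learning mechanism described in the abstract go beyond this simple baseline.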