Reducing Transmission Delay in EDCA Using Policy Gradient Reinforcement Learning

Masao Shinzaki, Yusuke Koda, Koji Yamamoto, T. Nishio, M. Morikura
{"title":"利用策略梯度强化学习减少EDCA的传输延迟","authors":"Masao Shinzaki, Yusuke Koda, Koji Yamamoto, T. Nishio, M. Morikura","doi":"10.1109/CCNC46108.2020.9045621","DOIUrl":null,"url":null,"abstract":"Towards ultra-reliable and low-latency communications, this paper proposes a packet mapping algorithm in an enhanced distributed channel access (EDCA) scheme using policy gradient reinforcement learning (RL). The EDCA scheme provides higher priority packets with more transmission opportunities by mapping packets to a predefined access category (AC); thereby, the EDCA scheme supports a higher quality of service in wireless local area networks. In this paper, it is noted that by mapping high priority packets to lower priority ACs, the one-packet delay of a high priority packet can be reduced. In contrast, the mapping algorithm cannot minimize the multiple-packets delay because the mapping algorithm is based on the current status. This is because, from a long-term perspective, mapping high priority packets is required as a countermeasure for collisions, to minimize the multiple-packets delay. As a solution, this paper proposes a new mapping algorithm using RL because RL is suitable for maximizing the reward from a long-term perspective. The key idea is to design the state such that the state involves the number of packets having arrived at each AP in the past, which is an indicator expressing past status. In the designed RL task, the reward, i.e., the multiple-packets delay depends on an overall sequence of states and actions; hence, the recursive value function-based RL algorithms are not compatible. To solve this problem, this paper utilizes policy gradient RL, which learns the packet mapping policy from an overall state-action sequence and a consequent multiple-packets delay. The simulation result reveals that the transmission delay of the proposed mapping algorithm is 13.8% shorter than that of the conventional EDCA mapping algorithm.","PeriodicalId":443862,"journal":{"name":"2020 IEEE 17th Annual Consumer Communications & Networking Conference (CCNC)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Reducing Transmission Delay in EDCA Using Policy Gradient Reinforcement Learning\",\"authors\":\"Masao Shinzaki, Yusuke Koda, Koji Yamamoto, T. Nishio, M. Morikura\",\"doi\":\"10.1109/CCNC46108.2020.9045621\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Towards ultra-reliable and low-latency communications, this paper proposes a packet mapping algorithm in an enhanced distributed channel access (EDCA) scheme using policy gradient reinforcement learning (RL). The EDCA scheme provides higher priority packets with more transmission opportunities by mapping packets to a predefined access category (AC); thereby, the EDCA scheme supports a higher quality of service in wireless local area networks. In this paper, it is noted that by mapping high priority packets to lower priority ACs, the one-packet delay of a high priority packet can be reduced. In contrast, the mapping algorithm cannot minimize the multiple-packets delay because the mapping algorithm is based on the current status. This is because, from a long-term perspective, mapping high priority packets is required as a countermeasure for collisions, to minimize the multiple-packets delay. 
As a solution, this paper proposes a new mapping algorithm using RL because RL is suitable for maximizing the reward from a long-term perspective. The key idea is to design the state such that the state involves the number of packets having arrived at each AP in the past, which is an indicator expressing past status. In the designed RL task, the reward, i.e., the multiple-packets delay depends on an overall sequence of states and actions; hence, the recursive value function-based RL algorithms are not compatible. To solve this problem, this paper utilizes policy gradient RL, which learns the packet mapping policy from an overall state-action sequence and a consequent multiple-packets delay. The simulation result reveals that the transmission delay of the proposed mapping algorithm is 13.8% shorter than that of the conventional EDCA mapping algorithm.\",\"PeriodicalId\":443862,\"journal\":{\"name\":\"2020 IEEE 17th Annual Consumer Communications & Networking Conference (CCNC)\",\"volume\":\"24 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 IEEE 17th Annual Consumer Communications & Networking Conference (CCNC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CCNC46108.2020.9045621\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE 17th Annual Consumer Communications & Networking Conference (CCNC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CCNC46108.2020.9045621","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Towards ultra-reliable and low-latency communications, this paper proposes a packet mapping algorithm for the enhanced distributed channel access (EDCA) scheme using policy gradient reinforcement learning (RL). The EDCA scheme gives higher-priority packets more transmission opportunities by mapping packets to predefined access categories (ACs), thereby supporting a higher quality of service in wireless local area networks. This paper notes that mapping high-priority packets to lower-priority ACs can reduce the one-packet delay of a high-priority packet. However, such a mapping algorithm cannot minimize the multiple-packets delay, because it is based only on the current status; from a long-term perspective, minimizing the multiple-packets delay requires mapping high-priority packets so as to counteract collisions. As a solution, this paper proposes a new mapping algorithm using RL, which is well suited to maximizing reward from a long-term perspective. The key idea is to design the state so that it includes the number of packets that have arrived at each AC in the past, an indicator of past status. In the designed RL task, the reward, i.e., the multiple-packets delay, depends on the overall sequence of states and actions; hence, recursive value-function-based RL algorithms are not applicable. To solve this problem, this paper utilizes policy gradient RL, which learns the packet mapping policy from an overall state-action sequence and the consequent multiple-packets delay. Simulation results reveal that the transmission delay of the proposed mapping algorithm is 13.8% shorter than that of the conventional EDCA mapping algorithm.
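
To make the setup concrete, the sketch below (Python; not the authors' code) shows the conventional static EDCA mapping from IEEE 802.11 user priorities to the four access categories, together with one plausible encoding of the state the abstract describes: counts of packets that have arrived at each AC over a recent window. The window length and the class and function names are illustrative assumptions.

```python
# Illustrative sketch, not the authors' implementation.
from collections import deque

# The four EDCA access categories, highest to lowest priority.
ACS = ["AC_VO", "AC_VI", "AC_BE", "AC_BK"]

# Conventional static mapping from IEEE 802.11 user priority (UP) to AC.
UP_TO_AC = {
    7: "AC_VO", 6: "AC_VO",   # voice
    5: "AC_VI", 4: "AC_VI",   # video
    3: "AC_BE", 0: "AC_BE",   # best effort
    2: "AC_BK", 1: "AC_BK",   # background
}

class ArrivalState:
    """A state that reflects past status, not just the instantaneous one:
    counts of packets that arrived at each AC over a sliding window.
    The window length is an assumed parameter."""

    def __init__(self, window: int = 100):
        self.history = deque(maxlen=window)  # AC indices of recent arrivals

    def record_arrival(self, ac: str) -> None:
        self.history.append(ACS.index(ac))

    def vector(self) -> list:
        counts = [0] * len(ACS)
        for idx in self.history:
            counts[idx] += 1
        return counts  # e.g. [2, 0, 11, 3]
```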
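
Because the reward (the multiple-packets delay) is only observed after a whole sequence of states and actions, the paper turns to policy gradient RL rather than recursive value-function methods. Below is a minimal REINFORCE-style sketch of how a packet-mapping policy could be trained from one episode and its resulting delay; the linear softmax parameterization, learning rate, and function names are assumptions, not the paper's exact design.

```python
# Minimal REINFORCE-style sketch under the assumptions stated above.
import numpy as np

rng = np.random.default_rng(0)

N_ACTIONS = 4                    # which AC to map an arriving packet to
STATE_DIM = 4                    # e.g. the per-AC arrival counts above
theta = np.zeros((STATE_DIM, N_ACTIONS))  # linear softmax policy weights

def policy(state) -> np.ndarray:
    """Softmax distribution over AC choices for the given state."""
    s = np.asarray(state, dtype=float)
    logits = s @ theta
    p = np.exp(logits - logits.max())  # subtract max for numerical stability
    return p / p.sum()

def act(state) -> int:
    """Sample an AC-mapping action from the current policy."""
    return int(rng.choice(N_ACTIONS, p=policy(state)))

def reinforce_update(states, actions, multiple_packets_delay, lr=0.01):
    """One policy-gradient step from a whole episode. The episodic return
    is the negative delay, shared by every (state, action) pair."""
    global theta
    G = -multiple_packets_delay           # shorter delay -> larger return
    for s, a in zip(states, actions):
        s = np.asarray(s, dtype=float)
        p = policy(s)
        grad_log = np.outer(s, -p)        # d log pi(a|s) / d theta (softmax)
        grad_log[:, a] += s
        theta += lr * G * grad_log
```

In use, one episode would map a batch of arriving packets with act(), measure the resulting multiple-packets delay in a MAC simulator, and then call reinforce_update() with the recorded sequence, matching the "overall state-action sequence and consequent delay" structure the abstract describes.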