Fast Spectrum Sharing in Vehicular Networks: A Meta Reinforcement Learning Approach

Kai Huang, Zezhou Luo, Le Liang, Shi Jin
{"title":"车辆网络快速频谱共享:一种元强化学习方法","authors":"Kai Huang, Zezhou Luo, Le Liang, Shi Jin","doi":"10.1109/VTC2022-Fall57202.2022.10012705","DOIUrl":null,"url":null,"abstract":"In this paper, we investigate the resource allocation problem in a dynamic vehicular environment, where multiple vehicle-to-vehicle links attempt to reuse the spectrum of vehicle-to-infrastructure links. It is modeled as a deep reinforcement learning problem that is subject to proximal policy optimization. Training a well-performing policy usually requires a massive amount of interactions with the environment for a long time and thus is typically performed on a simulator. However, an agent well trained in a simulated environment may still fail when deployed in a live network, due to inevitable difference between the two environments, termed reality gap. We make preliminary efforts to address this issue by leveraging meta reinforcement learning that allows the learning agent to quickly adapt to a new environment with minimal interactions after being trained across a variety of similar tasks. We demonstrate that only a few episodes are required for the meta trained policy to adapt to a new environment and the proposed method is shown to achieve near-optimal performance and exhibit rapid convergence.","PeriodicalId":326047,"journal":{"name":"2022 IEEE 96th Vehicular Technology Conference (VTC2022-Fall)","volume":"65 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Fast Spectrum Sharing in Vehicular Networks: A Meta Reinforcement Learning Approach\",\"authors\":\"Kai Huang, Zezhou Luo, Le Liang, Shi Jin\",\"doi\":\"10.1109/VTC2022-Fall57202.2022.10012705\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, we investigate the resource allocation problem in a dynamic vehicular environment, where multiple vehicle-to-vehicle links attempt to reuse the spectrum of vehicle-to-infrastructure links. It is modeled as a deep reinforcement learning problem that is subject to proximal policy optimization. Training a well-performing policy usually requires a massive amount of interactions with the environment for a long time and thus is typically performed on a simulator. However, an agent well trained in a simulated environment may still fail when deployed in a live network, due to inevitable difference between the two environments, termed reality gap. We make preliminary efforts to address this issue by leveraging meta reinforcement learning that allows the learning agent to quickly adapt to a new environment with minimal interactions after being trained across a variety of similar tasks. 
We demonstrate that only a few episodes are required for the meta trained policy to adapt to a new environment and the proposed method is shown to achieve near-optimal performance and exhibit rapid convergence.\",\"PeriodicalId\":326047,\"journal\":{\"name\":\"2022 IEEE 96th Vehicular Technology Conference (VTC2022-Fall)\",\"volume\":\"65 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE 96th Vehicular Technology Conference (VTC2022-Fall)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/VTC2022-Fall57202.2022.10012705\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE 96th Vehicular Technology Conference (VTC2022-Fall)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/VTC2022-Fall57202.2022.10012705","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1

Abstract

In this paper, we investigate the resource allocation problem in a dynamic vehicular environment, where multiple vehicle-to-vehicle links attempt to reuse the spectrum of vehicle-to-infrastructure links. We model it as a deep reinforcement learning problem and solve it with proximal policy optimization (PPO). Training a well-performing policy usually requires a massive amount of interaction with the environment over a long time and is therefore typically performed on a simulator. However, an agent trained well in a simulated environment may still fail when deployed in a live network, due to inevitable differences between the two environments, termed the reality gap. We make preliminary efforts to address this issue by leveraging meta reinforcement learning, which allows the learning agent, after being trained across a variety of similar tasks, to quickly adapt to a new environment with minimal interaction. We demonstrate that only a few episodes are required for the meta-trained policy to adapt to a new environment, and the proposed method achieves near-optimal performance with rapid convergence.
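The abstract outlines the general recipe: meta-train a policy across many draws of a simulated vehicular environment so that a handful of episodes suffice to adapt to an unseen one. Below is a minimal, self-contained sketch of that idea, not the paper's implementation: the toy contextual-bandit "task" stands in for the V2V/V2I spectrum-sharing simulator, a REINFORCE surrogate stands in for the paper's PPO objective, and the Reptile-style outer update is one common first-order meta-learning rule that may differ from the paper's exact algorithm. All names and dimensions are illustrative.

```python
# Hedged sketch of meta-RL for fast adaptation (not the paper's code).
import copy
import torch
import torch.nn as nn

OBS_DIM, N_ACTIONS = 8, 4  # e.g., local CSI features -> sub-band choice (illustrative)

class Policy(nn.Module):
    """Small MLP mapping a local observation to a distribution over actions."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_DIM, 64), nn.Tanh(), nn.Linear(64, N_ACTIONS))

    def forward(self, obs):
        return torch.distributions.Categorical(logits=self.net(obs))

def sample_task():
    """A task = a random linear reward model; stands in for one draw of the
    vehicular environment (topology, channel statistics)."""
    return torch.randn(OBS_DIM, N_ACTIONS)

def episode_loss(policy, task, batch=32):
    """REINFORCE surrogate on one batch of interactions (toy stand-in for a
    PPO update on simulated V2V episodes)."""
    obs = torch.randn(batch, OBS_DIM)
    dist = policy(obs)
    actions = dist.sample()
    rewards = (obs @ task).gather(1, actions.unsqueeze(1)).squeeze(1)
    # Mean-reward baseline reduces variance of the gradient estimate.
    return -(dist.log_prob(actions) * (rewards - rewards.mean())).mean()

def adapt(policy, task, steps=5, lr=0.1):
    """Few-episode inner-loop adaptation on a single task."""
    adapted = copy.deepcopy(policy)
    opt = torch.optim.SGD(adapted.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        episode_loss(adapted, task).backward()
        opt.step()
    return adapted

def meta_train(policy, meta_iters=200, meta_lr=0.05):
    """Reptile-style outer loop: nudge meta-parameters toward task-adapted ones."""
    for _ in range(meta_iters):
        adapted = adapt(policy, sample_task())
        with torch.no_grad():
            for p, q in zip(policy.parameters(), adapted.parameters()):
                p.add_(meta_lr * (q - p))
    return policy

if __name__ == "__main__":
    meta_policy = meta_train(Policy())
    # Deployment: adapt to a previously unseen task with only a few episodes,
    # rather than training a policy from scratch.
    fast_adapted = adapt(meta_policy, sample_task(), steps=5)
```

The property this illustrates is the one the abstract claims: after meta-training, `adapt` needs only a few gradient steps (episodes) on a new task, which is what makes the approach attractive for closing the reality gap with minimal live interaction.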