Deep Reinforcement Learning for Over-the-Air Federated Learning in SWIPT-Enabled IoT Networks

Xinran Zhang, Hui Tian, Wanli Ni, Mengying Sun
2022 IEEE 96th Vehicular Technology Conference (VTC2022-Fall), September 2022.
DOI: 10.1109/VTC2022-Fall57202.2022.10012702 · Citations: 1

Abstract

As a distributed machine learning paradigm, federated learning (FL) has been regarded as a promising candidate for preserving user privacy in Internet of Things (IoT) networks. Leveraging the waveform superposition property of wireless channels, over-the-air FL (AirFL) achieves fast model aggregation by integrating communication and computation via concurrent analog transmissions. To support sustainable AirFL among energy-constrained IoT devices, we consider that the base station (BS) adopts simultaneous wireless information and power transfer (SWIPT) to distribute the global model and charge local devices in each communication round. To maximize the long-term energy efficiency (EE) of AirFL, we investigate a resource allocation problem that jointly optimizes the time division, transceiver beamforming, and power splitting in SWIPT-enabled IoT networks. Because these continuous variables are closely coupled, we propose a deep reinforcement learning (DRL) algorithm based on the twin delayed deep deterministic policy gradient (TD3) to jointly determine downlink and uplink communication strategies through coordination between the BS and devices. Simulation results show that the proposed TD3 algorithm achieves about a 41% EE improvement over traditional optimization methods and other DRL algorithms.
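To make the two mechanisms in the abstract concrete, the following is a minimal numpy sketch of one AirFL round: devices split their received downlink power between energy harvesting and information decoding (SWIPT), then transmit their local model updates concurrently so they superpose over the air at the BS. All numerical values, the channel-inversion pre-scaling, and the normalization factor `eta` are illustrative assumptions for this sketch, not the paper's optimized beamforming and power-splitting design.

```python
import numpy as np

rng = np.random.default_rng(0)
K, d = 8, 16  # number of devices, model dimension (illustrative values)

# Each device holds a local model update to be aggregated at the BS.
local_updates = rng.normal(size=(K, d))

# Flat-fading uplink channel gains between the devices and the BS.
h = rng.rayleigh(scale=1.0, size=K)

# SWIPT power splitting at each device: a fraction rho of the received
# downlink power is harvested as energy; the rest decodes the global model.
rho, P_dl = 0.6, 1.0           # splitting ratio and BS transmit power (assumed)
harvested = rho * P_dl * h**2  # energy available to power the uplink round

# Over-the-air aggregation: devices pre-scale by channel inversion so their
# concurrent analog transmissions superpose coherently at the BS.
eta = 1.0                      # receive normalization factor
b = eta / h                    # per-device transmit scalar
noise = rng.normal(scale=0.1, size=d)
received = (h[:, None] * b[:, None] * local_updates).sum(axis=0) + noise

# The BS recovers the averaged model update in a single channel use.
aggregate = received / (eta * K)
ideal = local_updates.mean(axis=0)
print(np.abs(aggregate - ideal).max())  # residual error comes only from noise
```

The key point the sketch illustrates is that aggregation cost does not grow with the number of devices: all K updates arrive in one channel use, and the only distortion is the additive receiver noise scaled by 1/(eta·K). In the paper, the TD3 agent tunes quantities like rho, the beamformers, and the time division that this sketch fixes by hand.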