D2PG: deep deterministic policy gradient based for maximizing network throughput in clustered EH-WSN

IF 2.1, Zone 4 (Computer Science), JCR Q3: COMPUTER SCIENCE, INFORMATION SYSTEMS

Wireless Networks Pub Date : 2024-05-26 DOI:10.1007/s11276-024-03767-5

Mojtaba Farmani, Saman Farnam, Razieh Mohammadi, Zahra Shirmohammadi


Citations: 0

Abstract

Wireless sensor networks are an effective technology for monitoring and sensing in a wide range of applications. In these networks, sensors are powered by batteries with limited energy capacity, so the energy they need is obtained from the surrounding environment using energy harvesters. These environmental sources are unpredictable, however, which makes power management a critical issue. Reinforcement Learning (RL) algorithms offer an efficient solution for throughput management in these networks, adjusting the data rates of nodes according to the network’s energy conditions. Nevertheless, previous RL-based throughput management methods share a key limitation: they discretize the state space, which does not guarantee the maximum improvement in network throughput. This paper therefore proposes Deep Deterministic Policy Gradient-Based for Maximizing Network Throughput (D2PG), a method that uses the Deep Reinforcement Learning algorithm known as Deep Deterministic Policy Gradient and introduces a novel reward function. By optimizing over a continuous state space with respect to sensor energy consumption, the method can maximize the data transmission rate and enhance throughput across the entire network. D2PG is evaluated against the RL, RL-new, and Deep Q-Network methods and improves network throughput by 15.3%, 12.9%, and 5.7%, respectively. Additionally, the new reward function performs better at keeping the data rate proportional to the energy level.
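The abstract's central argument is that a policy over a discretized state space cannot always pick the best data rate, whereas a deterministic policy over a continuous state can. The toy sketch below illustrates only that contrast. It is not the paper's method: there is no training loop, the clipped linear map merely stands in for a learned actor network, and every name, constant, and the reward shape are hypothetical assumptions made for illustration.

```python
"""Toy illustration: discretized-state policy vs. continuous-state
deterministic policy for choosing a node's data rate.
All names, constants, and the reward shape are hypothetical and are
NOT taken from the D2PG paper."""
import random

MAX_RATE = 1.0  # normalized maximum data rate (assumed)


def discretized_policy(energy: float, n_bins: int = 4) -> float:
    """Tabular-style policy: energy in [0, 1] is bucketed into n_bins
    levels, so every energy inside one bin receives the same rate."""
    bin_index = min(int(energy * n_bins), n_bins - 1)
    return MAX_RATE * bin_index / n_bins  # rate at the bin's lower edge


def continuous_policy(energy: float, weight: float = 1.0) -> float:
    """Deterministic map from a continuous state to a continuous action.
    A clipped linear function stands in for a learned actor network."""
    return max(0.0, min(MAX_RATE, weight * energy))


def reward(energy: float, rate: float) -> float:
    """Hypothetical reward: pays for throughput, but heavily penalizes
    transmitting at a rate the available energy cannot sustain."""
    overspend = max(0.0, rate - energy)
    return rate - 10.0 * overspend


def average_reward(policy, energies) -> float:
    """Mean reward of a policy over a list of sampled energy levels."""
    return sum(reward(e, policy(e)) for e in energies) / len(energies)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    samples = [rng.random() for _ in range(1000)]
    print("discretized:", average_reward(discretized_policy, samples))
    print("continuous: ", average_reward(continuous_policy, samples))
```

Under these assumptions the discretized policy rounds every energy level down to its bin edge, so it never overspends but leaves rate unused within each bin; the continuous policy can track the energy level exactly. A real Deep Deterministic Policy Gradient agent would learn such a mapping with actor and critic networks rather than have it fixed by hand.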



Source journal: Wireless Networks (Engineering & Technology: Telecommunications)

CiteScore: 7.70

Self-citation rate: 3.30%

Articles published: 314

Review time: 5.5 months

Journal description: The wireless communication revolution is bringing fundamental changes to data networking and telecommunication, and is making integrated networks a reality. By freeing the user from the cord, personal communications networks, wireless LANs, mobile radio networks and cellular systems harbor the promise of fully distributed mobile computing and communications, any time, anywhere. Focusing on the networking and user aspects of the field, Wireless Networks provides a global forum for archival-value contributions documenting these fast-growing areas of interest. The journal publishes refereed articles dealing with research, experience and management issues of wireless networks. Its aim is to allow the reader to benefit from the experience, problems and solutions described.