{"title":"Optimizing the distribution of tasks in Internet of Things using edge processing-based reinforcement learning","authors":"Mohsen Latifi, Nahideh Derakhshanfard, Hossein Heydari","doi":"10.1016/j.iswa.2025.200585","DOIUrl":null,"url":null,"abstract":"<div><div>As the Internet of Things expands, managing intelligent tasks in dynamic and heterogeneous environments has emerged as a primary challenge for processing-based systems at the network’s edge. A critical issue in this domain is the optimal allocation of tasks. A review of prior studies indicates that many existing approaches either focus on a single objective or suffer from instability and overestimation of decision values during the learning phase. This paper aims to bridge this by proposing an approach that utilizes reinforcement learning with a double Q-learning algorithm and a multi-objective reward function. Furthermore, the designed reward function facilitates intelligent decision-making under more realistic conditions by incorporating three essential factors: task execution delay, energy consumption of edge nodes, and computational load balancing across the nodes. The inputs for the proposed method encompass information such as task sizes, deadlines for each task, remaining energy in the nodes, computational power of the nodes, proximity to the edge nodes, and the current workload of each node. The method's output at any given moment is the decision regarding assigning any task to the most suitable node. Simulation results in a dynamic environment demonstrate that the proposed method outperforms traditional reinforcement learning algorithms. Specifically, the average task execution delay has been reduced by up to 23%, the energy consumption of the nodes has decreased by up to 18%, and load balancing among nodes has improved by up to 27%.</div></div>","PeriodicalId":100684,"journal":{"name":"Intelligent Systems with Applications","volume":"28 ","pages":"Article 200585"},"PeriodicalIF":4.3000,"publicationDate":"2025-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Intelligent Systems with Applications","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2667305325001115","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
As the Internet of Things expands, managing intelligent tasks in dynamic and heterogeneous environments has emerged as a primary challenge for edge processing-based systems. A critical issue in this domain is the optimal allocation of tasks. A review of prior studies indicates that many existing approaches either focus on a single objective or suffer from instability and overestimation of decision values during the learning phase. This paper aims to bridge this gap by proposing an approach that combines reinforcement learning, via a double Q-learning algorithm, with a multi-objective reward function. The designed reward function facilitates intelligent decision-making under realistic conditions by incorporating three essential factors: task execution delay, energy consumption of edge nodes, and computational load balancing across the nodes. The inputs to the proposed method include task sizes, per-task deadlines, remaining node energy, node computational power, proximity to the edge nodes, and the current workload of each node. The method's output at any given moment is the assignment of each task to the most suitable node. Simulation results in a dynamic environment demonstrate that the proposed method outperforms traditional reinforcement learning algorithms: the average task execution delay is reduced by up to 23%, node energy consumption decreases by up to 18%, and load balancing among nodes improves by up to 27%.
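
The abstract does not specify the reward weights, state encoding, or update equations, so the following is only a minimal Python sketch of the general technique it names: a double Q-learning scheduler with a weighted-sum multi-objective reward over delay, energy, and load imbalance. All identifiers, weight values, and the state/action representation below are illustrative assumptions, not the authors' implementation.

    import random
    from collections import defaultdict

    # Hypothetical weights for the three reward factors named in the abstract;
    # the paper's actual formulation is not given here.
    W_DELAY, W_ENERGY, W_BALANCE = 0.5, 0.3, 0.2

    def reward(delay, energy, load_imbalance):
        """Weighted-sum reward: lower delay, energy use, and imbalance are better."""
        return -(W_DELAY * delay + W_ENERGY * energy + W_BALANCE * load_imbalance)

    class DoubleQAgent:
        """Tabular double Q-learning: two value tables are maintained so that
        action selection and action evaluation are decoupled, which reduces
        the overestimation bias of standard Q-learning."""

        def __init__(self, nodes, alpha=0.1, gamma=0.9, epsilon=0.1):
            self.q_a = defaultdict(float)   # Q^A(s, a)
            self.q_b = defaultdict(float)   # Q^B(s, a)
            self.nodes = nodes              # candidate edge nodes (the action set)
            self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

        def select_node(self, state):
            """Epsilon-greedy selection over the sum of both tables."""
            if random.random() < self.epsilon:
                return random.choice(self.nodes)
            return max(self.nodes,
                       key=lambda a: self.q_a[(state, a)] + self.q_b[(state, a)])

        def update(self, state, action, r, next_state):
            # Randomly choose which table to update; the greedy action is
            # picked with one table and evaluated with the other.
            if random.random() < 0.5:
                best = max(self.nodes, key=lambda a: self.q_a[(next_state, a)])
                target = r + self.gamma * self.q_b[(next_state, best)]
                key = (state, action)
                self.q_a[key] += self.alpha * (target - self.q_a[key])
            else:
                best = max(self.nodes, key=lambda a: self.q_b[(next_state, a)])
                target = r + self.gamma * self.q_a[(next_state, best)]
                key = (state, action)
                self.q_b[key] += self.alpha * (target - self.q_b[key])

A scheduling step under this sketch would observe the state (task size, deadline, node energies, and loads, discretized into a hashable key), call select_node to pick an edge node, then, after the task runs, call update with a reward computed from the measured delay, energy draw, and load imbalance.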