Securing IoT devices in edge computing through reinforcement learning
Authors: Anit Kumar, Dhanpratap Singh
Journal: Computers & Security, volume 155, Article 104474 (impact factor 4.8; JCR Q1, Computer Science, Information Systems)
Published: 2025-04-08
DOI: 10.1016/j.cose.2025.104474
URL: https://www.sciencedirect.com/science/article/pii/S0167404825001634
The exponentially growing demand for IoT devices, together with the expectation that they fully meet user needs, calls for integrating an edge server on the premises of the IoT devices. Because these devices are small yet must support complex computation and high-end software, they need additional hardware resources that cannot be provided without an edge server. Since the edge server continuously gathers data from the IoT devices for further computation and permanent storage, either locally or on a cloud server, it attracts intruders who try to steal the devices' sensitive data from it. With the many artificial intelligence tools now available, an intruder can mount serious attacks on the edge server by breaching its security boundaries. Any autonomous entity, such as a robot, satellite, or self-driving vehicle, has a set of interconnected IoT devices (sensors) that form a network, and this network must be flexible enough that a new IoT device can be integrated without major difficulty. Organizations do not adopt non-scalable IoT networks. To counter these security challenges, we propose a scalable, robust, and reliable novel reinforcement learning approach with a task scheduling mechanism driven by epsilon-greedy Q-learning. The novelty of our method is its high performance: the agent takes action only when it detects a noticeable drop in network performance, measured by packet delivery ratio, average throughput, and end-to-end delay. Experiments on simulated and real datasets show that the proposed security method outperforms the other security approaches discussed in this paper and counters malicious attacks efficiently.
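The abstract does not give the state space, action set, reward, or thresholds, so the following is only a minimal sketch of how an epsilon-greedy Q-learning agent could be gated on a drop in the three named metrics. The `NetworkMetrics` class, the `allow`/`block` actions, the 10% tolerance, and the `alpha`/`gamma` values are all illustrative assumptions, not the authors' design.

```python
import random
from dataclasses import dataclass


@dataclass
class NetworkMetrics:
    # Hypothetical container for the three metrics named in the abstract.
    packet_delivery_ratio: float  # fraction of packets delivered, in [0, 1]
    avg_throughput: float         # any consistent unit, e.g. kbit/s
    end_to_end_delay: float       # any consistent unit, e.g. ms


def performance_dropped(baseline, current, tolerance=0.10):
    """Return True if any metric is worse than baseline by more than
    `tolerance` (relative). The 10% default is an assumed value."""
    return (
        current.packet_delivery_ratio < baseline.packet_delivery_ratio * (1 - tolerance)
        or current.avg_throughput < baseline.avg_throughput * (1 - tolerance)
        or current.end_to_end_delay > baseline.end_to_end_delay * (1 + tolerance)
    )


# Assumed action set; the paper's actual actions are not stated in the abstract.
ACTIONS = ("allow", "block")


def choose_action(q_table, state, epsilon, rng):
    """Epsilon-greedy selection: explore with probability epsilon,
    otherwise pick the action with the highest Q-value."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_table.get((state, a), 0.0))


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q_table.get((next_state, a), 0.0) for a in ACTIONS)
    old = q_table.get((state, action), 0.0)
    q_table[(state, action)] = old + alpha * (reward + gamma * best_next - old)


def step(q_table, baseline, current, state, epsilon, rng):
    """Act only when a noticeable performance drop is detected;
    otherwise return None and leave traffic untouched."""
    if not performance_dropped(baseline, current):
        return None
    return choose_action(q_table, state, epsilon, rng)
```

In this sketch the agent stays idle while the metrics are within tolerance of the baseline, which mirrors the abstract's description of acting only on a noticeable drop. How the reward is derived from benign and malicious traffic is left open here because the abstract does not specify it.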
Once our security model has been trained a threshold number of times, we observe that no benign data packets are lost even in the presence of external threats, and the model consistently provides stable communication to end users. The proposed reinforcement learning method is more consistent, resilient, scalable, and accurate than other similar machine learning-based security methods and always has a false positive rate below 2%.
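For reference, the false positive rate quoted above is conventionally the share of benign packets wrongly flagged as malicious, FP / (FP + TN). The sketch below shows that calculation. The function name and the counts in the usage note are illustrative assumptions and are not results from the paper.

```python
def false_positive_rate(false_positives: int, true_negatives: int) -> float:
    """Fraction of benign packets that were wrongly flagged as malicious.

    false_positives: benign packets flagged as malicious
    true_negatives:  benign packets correctly passed through
    """
    benign_total = false_positives + true_negatives
    if benign_total == 0:
        raise ValueError("no benign packets observed; rate is undefined")
    return false_positives / benign_total
```

With made-up counts of 15 wrongly flagged and 985 correctly passed benign packets, `false_positive_rate(15, 985)` gives 0.015, which would fall under the 2% bound stated in the abstract.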
About the journal:
Computers & Security is the most respected technical journal in the IT security field. With its high-profile editorial board and informative regular features and columns, the journal is essential reading for IT security professionals around the world.
Computers & Security provides you with a unique blend of leading edge research and sound practical management advice. It is aimed at the professional involved with computer security, audit, control and data integrity in all sectors - industry, commerce and academia. Recognized worldwide as THE primary source of reference for applied research and technical expertise it is your first step to fully secure systems.