Reconfigurable Embedded Devices Using Reinforcement Learning to Develop Action-Policies
Authors: Alwyn Burger, David W. King, Gregor Schiele
Venue: 2020 IEEE International Conference on Autonomic Computing and Self-Organizing Systems (ACSOS)
Publication date: 2020-08-01
DOI: 10.1109/ACSOS49614.2020.00046 (https://doi.org/10.1109/ACSOS49614.2020.00046)
Citation count: 2
Abstract
The size of sensor networks supporting smart cities is ever increasing. Sensor network resiliency becomes vital for critical networks such as emergency response and wastewater treatment. One approach is to engineer ‘self-aware’ sensors that can proactively change their component composition in response to changes in workload when critical devices fail. By extension, these devices could anticipate their own termination, such as battery depletion, and offload current tasks onto connected devices. These neighboring devices can then reconfigure themselves to process these tasks, thus avoiding catastrophic network failure. In this article, we present an array of self-aware sensors that use Q-learning to develop a policy that guides device reaction to various environmental stimuli. The novelty lies in the use of field-programmable gate arrays (FPGAs) embedded on the sensors, which take into account internal system state, configuration, and learned state-action pairs to guide device decisions in order to meet system demands. Experiments show that even relatively simple reward functions develop Q-learning policies that yield positive device behaviors in dynamic environments.
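The abstract does not specify the state space, actions, or reward function, so the following is only a minimal sketch of how tabular Q-learning can produce a reconfiguration policy from a simple reward. Everything in it is an assumption made for illustration: the two workload levels, the two configurations, the `keep`/`reconfigure` actions, the reward values, and the hyperparameters are hypothetical and are not taken from the paper.

```python
import random

# Hypothetical toy model, not the authors' state space, actions, or reward.
WORKLOADS = ("low", "high")        # environmental stimulus seen by the device
CONFIGS = ("small", "large")       # hardware configuration currently loaded
ACTIONS = ("keep", "reconfigure")  # "reconfigure" swaps to the other configuration


def step(state, action, rng):
    """Apply an action and return (next_state, reward)."""
    workload, config = state
    if action == "reconfigure":
        config = "large" if config == "small" else "small"
    # Assumed reward: +1 when the configuration suits the workload, -1 otherwise,
    # minus a small assumed cost for each reconfiguration.
    suited = (workload == "high") == (config == "large")
    reward = (1.0 if suited else -1.0) - (0.1 if action == "reconfigure" else 0.0)
    next_state = (rng.choice(WORKLOADS), config)  # workload changes at random
    return next_state, reward


def greedy(q, state):
    """Return the action with the highest learned value in this state."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


def train(steps=5000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {((w, c), a): 0.0 for w in WORKLOADS for c in CONFIGS for a in ACTIONS}
    state = ("low", "small")
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = greedy(q, state)
        next_state, reward = step(state, action, rng)
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        # Standard Q-learning update toward the bootstrapped target.
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = next_state
    return q


if __name__ == "__main__":
    q = train()
    for w in WORKLOADS:
        for c in CONFIGS:
            print(w, c, "->", greedy(q, (w, c)))
```

In this toy setting the learned greedy policy keeps a configuration that suits the current workload and reconfigures otherwise, which mirrors, in a much simplified form, the claim that a simple reward can yield sensible device behavior.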