Multi-Objective Resource Scheduling for IoT Systems Using Reinforcement Learning

Impact factor: 1.6 | JCR: Q3 | Category: Engineering, Electrical & Electronic
Shaswot Shresthamali, Masaaki Kondo, Hiroshi Nakamura
DOI: 10.3390/jlpea12040053 (https://doi.org/10.3390/jlpea12040053)
Journal: Journal of Low Power Electronics and Applications
Publication date: 2022-10-08
Publication type: Journal Article
Citations: 1

Multi-Objective Resource Scheduling for IoT Systems Using Reinforcement Learning
IoT embedded systems have multiple objectives that need to be maximized simultaneously. These objectives conflict with each other because resources are limited and tradeoffs must be made. This calls for multi-objective optimization (MOO), for which multiple Pareto-optimal solutions are possible. In such cases, tradeoffs are made with respect to a user-defined preference. This work presents a general Multi-objective Reinforcement Learning (MORL) framework for MOO of IoT embedded systems. The framework comprises a general Multi-objective Markov Decision Process (MOMDP) formulation and two novel low-compute MORL algorithms. The algorithms learn policies that trade off between multiple objectives using a single preference parameter. We take the energy scheduling problem in general Energy Harvesting Wireless Sensor Nodes (EHWSNs) as a case example, in which a sensor node is required to maximize its sensing rate and transmission performance while ensuring long-term uninterrupted operation within a very tight energy budget. We simulate single-task and dual-task EHWSN systems to evaluate our framework. The results demonstrate that our MORL algorithms can learn better policies at lower learning costs and successfully trade off between multiple objectives at runtime.
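The abstract does not say how the single preference parameter enters the learning algorithms. As a rough illustration only, the sketch below assumes linear scalarization, a common MORL baseline that is not necessarily the authors' method. The function names, action labels, and value estimates are all hypothetical.

```python
from typing import Dict, Sequence


def scalarize(rewards: Sequence[float], preference: float) -> float:
    """Linear scalarization of a two-objective reward.

    Assumption: this weighting scheme is illustrative and is not taken
    from the paper. `preference` = 1.0 favors the first objective only,
    0.0 favors the second objective only.
    """
    if not 0.0 <= preference <= 1.0:
        raise ValueError("preference must lie in [0, 1]")
    if len(rewards) != 2:
        raise ValueError("expected exactly two objective rewards")
    first, second = rewards
    return preference * first + (1.0 - preference) * second


def greedy_action(q_values: Dict[str, Sequence[float]], preference: float) -> str:
    """Pick the action whose vector-valued estimate scalarizes highest."""
    return max(q_values, key=lambda action: scalarize(q_values[action], preference))


# Hypothetical per-action value estimates: (sensing, transmission).
q_values = {
    "sense_more": (0.9, 0.2),
    "transmit_more": (0.3, 0.8),
    "balanced": (0.6, 0.6),
}

for w in (0.9, 0.5, 0.1):
    print(w, greedy_action(q_values, w))
```

Changing the preference at runtime changes which action is chosen without retraining the estimates, which is the kind of runtime tradeoff the abstract describes. A linear weighting can only reach solutions on the convex part of the Pareto front.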
Source journal: Journal of Low Power Electronics and Applications (Engineering: Electrical and Electronic Engineering)
CiteScore: 3.60
Self-citation rate: 14.30%
Articles published: 57
Review time: 11 weeks