Adaptive multi-layer reward based Q-learning for source tracing in diffusive molecular communications environment with obstacles
Zhibo Lou, Wence Zhang, Xu Bao
Nano Communication Networks, vol. 38, Article 100478. Published 2023-10-04. DOI: 10.1016/j.nancom.2023.100478
Impact factor: 2.9. JCR: Q2 (Engineering, Electrical & Electronic).
https://www.sciencedirect.com/science/article/pii/S1878778923000443
Benefiting from the fast development of nanotechnology, molecular communication (MC) has received great attention in recent years. In many potential applications of MC, such as drug delivery and pollution prevention, it is essential to locate or trace the target. In this paper, we consider a 3D diffusive MC environment consisting of several obstacles, a molecule-releasing source (RS) and a mobile molecule sensor (MS) that aims to find the RS within a time constraint. The problem is reformulated as a Markov decision process (MDP), and an adaptive multi-layer reward based Q-learning (AMR-Q Learning) approach is proposed. By exploiting information from the number of received molecules and adaptively setting multi-layer rewards, an MS with AMR-Q Learning can find the RS efficiently, unlike the gradient-based method, which is usually trapped at locally optimal points. Numerical results demonstrate that the proposed AMR-Q Learning approach outperforms existing path-planning schemes with significantly reduced training overhead.
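The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea (tabular Q-learning on a 3D grid with obstacles, where the per-step reward depends on which "layer" the received molecule count falls in). It is not the authors' AMR-Q Learning method. Everything named here is an assumption for illustration: the noise-free `molecule_count` model that decays as 1/(1 + distance) and ignores how obstacles distort the field, the layer thresholds taken as fractions of the highest count seen so far, the reward values, and all hyperparameters.

```python
import random
from collections import defaultdict

# Six axis-aligned moves on a 3D grid (hypothetical action set).
ACTIONS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def molecule_count(pos, source, strength=100.0):
    """Assumed noise-free sensor reading that decays with distance to the source."""
    dist = sum((p - s) ** 2 for p, s in zip(pos, source)) ** 0.5
    return strength / (1.0 + dist)

def layered_reward(count, peak, layers=(0.25, 0.5, 0.75)):
    """Step penalty that shrinks by one for each layer threshold the count reaches."""
    passed = sum(count >= frac * peak for frac in layers)
    return float(passed - len(layers) - 1)  # between -(len(layers) + 1) and -1

def step(pos, action, size, obstacles):
    """Move one cell; stay in place if the move leaves the grid or hits an obstacle."""
    nxt = tuple(p + a for p, a in zip(pos, action))
    if any(c < 0 or c >= size for c in nxt) or nxt in obstacles:
        return pos
    return nxt

def train(size, source, obstacles, start, episodes=300, max_steps=60,
          alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Epsilon-greedy tabular Q-learning; max_steps stands in for the time constraint."""
    rng = random.Random(seed)
    q = defaultdict(float)
    peak = molecule_count(start, source)  # adaptive part: highest count seen so far
    n = len(ACTIONS)
    for _ in range(episodes):
        pos = start
        for _ in range(max_steps):
            if rng.random() < epsilon:
                a = rng.randrange(n)
            else:
                a = max(range(n), key=lambda i: q[(pos, i)])
            nxt = step(pos, ACTIONS[a], size, obstacles)
            count = molecule_count(nxt, source)
            peak = max(peak, count)
            done = nxt == source
            reward = 10.0 if done else layered_reward(count, peak)
            best_next = 0.0 if done else max(q[(nxt, i)] for i in range(n))
            # Standard Q-learning update toward reward + discounted best next value.
            q[(pos, a)] += alpha * (reward + gamma * best_next - q[(pos, a)])
            pos = nxt
            if done:
                break
    return q

def greedy_path(q, size, source, obstacles, start, max_steps=60):
    """Follow the learned greedy policy from start for at most max_steps moves."""
    pos, path = start, [start]
    for _ in range(max_steps):
        if pos == source:
            break
        a = max(range(len(ACTIONS)), key=lambda i: q[(pos, i)])
        pos = step(pos, ACTIONS[a], size, obstacles)
        path.append(pos)
    return path
```

As a usage example under the same assumptions, `q = train(5, (4, 4, 4), {(2, 2, 2)}, (0, 0, 0))` followed by `greedy_path(q, 5, (4, 4, 4), {(2, 2, 2)}, (0, 0, 0))` returns the cells visited by the greedy policy. Whether that path reaches the source depends on the training budget and the placeholder reward design. Keeping every non-terminal reward negative is a deliberate choice in this sketch so that lingering near the source is never worth more than reaching it.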
About the journal:
The Nano Communication Networks Journal is an international, archival and multi-disciplinary journal providing a publication vehicle for complete coverage of all topics of interest to those involved in all aspects of nanoscale communication and networking. Theoretical research contributions presenting new techniques, concepts or analyses; applied contributions reporting on experiences and experiments; and tutorial and survey manuscripts are published.
Nano Communication Networks is a part of the COMNET (Computer Networks) family of journals within Elsevier. The family of journals covers all aspects of networking except nanonetworking, which is the scope of this journal.