{"title":"优化gps拒绝环境下移动资产位置跟踪的同步时间","authors":"C. M. Bowyer, J. Shea, T. Wong, W. Dixon","doi":"10.1109/GLOBECOM48099.2022.10001565","DOIUrl":null,"url":null,"abstract":"A network of distributed agents operating in a GPS-denied environment is tasked with tracking the position of a mobile asset. The agents use time-of-flight (ToF) measurements obtained from transmissions of a beacon signal by the asset to estimate the asset's position, but the estimates are noisy due to clock drift at the agents' local clocks. The clock drift can be reduced by having the asset and network of agents perform a distributed synchronization process; however, synchronization and localization cannot be performed simultaneously because the clock states are not well defined during synchronization and due to constraints on the radios' communication resources. The problem of optimizing when the agents should perform sensing or synchronization is formulated as a partially observed Markov decision process (POMDP) with continuous observations. Table-based Q-learning is used to search for an optimal policy on the belief space of the POMDP, but some form of approximation must be used because the belief space is a high -dimensional continuous space. We compare the performance of two approaches: 1) in replicated Q-learning, we learn a $Q$ function that is a linear function of the continuous beliefs; 2) in triple-Q learning, the beliefs are replaced by a model with fewer parameters, and quantization is used on a continuous parameter. Simulations are used to compare the performance of these methods with the approach that is most commonly used, which is using periodic synchronization at an optimized, fixed rate.","PeriodicalId":313199,"journal":{"name":"GLOBECOM 2022 - 2022 IEEE Global Communications Conference","volume":"45 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Optimizing Synchronization Times for Position Tracking of a Mobile Asset in GPS-denied Environments\",\"authors\":\"C. M. Bowyer, J. Shea, T. Wong, W. Dixon\",\"doi\":\"10.1109/GLOBECOM48099.2022.10001565\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"A network of distributed agents operating in a GPS-denied environment is tasked with tracking the position of a mobile asset. The agents use time-of-flight (ToF) measurements obtained from transmissions of a beacon signal by the asset to estimate the asset's position, but the estimates are noisy due to clock drift at the agents' local clocks. The clock drift can be reduced by having the asset and network of agents perform a distributed synchronization process; however, synchronization and localization cannot be performed simultaneously because the clock states are not well defined during synchronization and due to constraints on the radios' communication resources. The problem of optimizing when the agents should perform sensing or synchronization is formulated as a partially observed Markov decision process (POMDP) with continuous observations. Table-based Q-learning is used to search for an optimal policy on the belief space of the POMDP, but some form of approximation must be used because the belief space is a high -dimensional continuous space. 
We compare the performance of two approaches: 1) in replicated Q-learning, we learn a $Q$ function that is a linear function of the continuous beliefs; 2) in triple-Q learning, the beliefs are replaced by a model with fewer parameters, and quantization is used on a continuous parameter. Simulations are used to compare the performance of these methods with the approach that is most commonly used, which is using periodic synchronization at an optimized, fixed rate.\",\"PeriodicalId\":313199,\"journal\":{\"name\":\"GLOBECOM 2022 - 2022 IEEE Global Communications Conference\",\"volume\":\"45 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-12-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"GLOBECOM 2022 - 2022 IEEE Global Communications Conference\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/GLOBECOM48099.2022.10001565\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"GLOBECOM 2022 - 2022 IEEE Global Communications Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/GLOBECOM48099.2022.10001565","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Optimizing Synchronization Times for Position Tracking of a Mobile Asset in GPS-denied Environments
A network of distributed agents operating in a GPS-denied environment is tasked with tracking the position of a mobile asset. The agents use time-of-flight (ToF) measurements obtained from transmissions of a beacon signal by the asset to estimate the asset's position, but the estimates are noisy because of drift in the agents' local clocks. The clock drift can be reduced by having the asset and the network of agents perform a distributed synchronization process; however, synchronization and localization cannot be performed simultaneously, because the clock states are not well defined during synchronization and because of constraints on the radios' communication resources. The problem of optimizing when the agents should perform sensing or synchronization is formulated as a partially observed Markov decision process (POMDP) with continuous observations. Table-based Q-learning is used to search for an optimal policy over the belief space of the POMDP, but because the belief space is a high-dimensional continuous space, some form of approximation is required. We compare the performance of two approaches: 1) in replicated Q-learning, we learn a Q function that is a linear function of the continuous beliefs; 2) in triple-Q learning, the beliefs are replaced by a model with fewer parameters, and quantization is applied to a continuous parameter. Simulations compare the performance of these methods with the most commonly used approach, periodic synchronization at an optimized, fixed rate.
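To make the replicated Q-learning idea mentioned in the abstract concrete, the sketch below shows Q-learning with a Q function that is linear in a continuous belief vector. It is a minimal illustration under assumed action semantics (sense vs. synchronize), belief dimension, reward signal, and hyperparameters; it is not the authors' implementation.

# Illustrative sketch only (not the paper's code): Q-learning with a Q function
# that is linear in a continuous belief vector, in the spirit of replicated
# Q-learning. Action semantics, dimensions, and hyperparameters are assumptions.
import numpy as np

N_ACTIONS = 2     # assumed actions: 0 = sense (localize the asset), 1 = synchronize clocks
BELIEF_DIM = 8    # assumed dimension of the belief vector
ALPHA = 0.05      # learning rate
GAMMA = 0.95      # discount factor
EPSILON = 0.1     # exploration probability

# One weight vector per action, so Q(b, a) = w_a . b is linear in the belief b.
weights = np.zeros((N_ACTIONS, BELIEF_DIM))

def q_values(belief):
    # Vector of Q(b, a) over all actions for the current weights.
    return weights @ belief

def choose_action(belief, rng):
    # Epsilon-greedy selection on the approximate Q function.
    if rng.random() < EPSILON:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(q_values(belief)))

def q_update(belief, action, reward, next_belief):
    # One temporal-difference step: move w_a toward the TD target along the
    # belief vector, the update used when Q is linear in the belief.
    td_target = reward + GAMMA * np.max(q_values(next_belief))
    td_error = td_target - q_values(belief)[action]
    weights[action] += ALPHA * td_error * belief

In such a setup, the agent would call choose_action on the current belief at each decision epoch and q_update after observing the reward (for example, a penalty tied to the position-estimate error) and the next belief; the second approach in the paper instead replaces the full belief with a lower-dimensional parametric model and quantizes a continuous parameter.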