Optimizing Synchronization Times for Position Tracking of a Mobile Asset in GPS-denied Environments

C. M. Bowyer, J. Shea, T. Wong, W. Dixon
{"title":"优化gps拒绝环境下移动资产位置跟踪的同步时间","authors":"C. M. Bowyer, J. Shea, T. Wong, W. Dixon","doi":"10.1109/GLOBECOM48099.2022.10001565","DOIUrl":null,"url":null,"abstract":"A network of distributed agents operating in a GPS-denied environment is tasked with tracking the position of a mobile asset. The agents use time-of-flight (ToF) measurements obtained from transmissions of a beacon signal by the asset to estimate the asset's position, but the estimates are noisy due to clock drift at the agents' local clocks. The clock drift can be reduced by having the asset and network of agents perform a distributed synchronization process; however, synchronization and localization cannot be performed simultaneously because the clock states are not well defined during synchronization and due to constraints on the radios' communication resources. The problem of optimizing when the agents should perform sensing or synchronization is formulated as a partially observed Markov decision process (POMDP) with continuous observations. Table-based Q-learning is used to search for an optimal policy on the belief space of the POMDP, but some form of approximation must be used because the belief space is a high -dimensional continuous space. We compare the performance of two approaches: 1) in replicated Q-learning, we learn a $Q$ function that is a linear function of the continuous beliefs; 2) in triple-Q learning, the beliefs are replaced by a model with fewer parameters, and quantization is used on a continuous parameter. Simulations are used to compare the performance of these methods with the approach that is most commonly used, which is using periodic synchronization at an optimized, fixed rate.","PeriodicalId":313199,"journal":{"name":"GLOBECOM 2022 - 2022 IEEE Global Communications Conference","volume":"45 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Optimizing Synchronization Times for Position Tracking of a Mobile Asset in GPS-denied Environments\",\"authors\":\"C. M. Bowyer, J. Shea, T. Wong, W. Dixon\",\"doi\":\"10.1109/GLOBECOM48099.2022.10001565\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"A network of distributed agents operating in a GPS-denied environment is tasked with tracking the position of a mobile asset. The agents use time-of-flight (ToF) measurements obtained from transmissions of a beacon signal by the asset to estimate the asset's position, but the estimates are noisy due to clock drift at the agents' local clocks. The clock drift can be reduced by having the asset and network of agents perform a distributed synchronization process; however, synchronization and localization cannot be performed simultaneously because the clock states are not well defined during synchronization and due to constraints on the radios' communication resources. The problem of optimizing when the agents should perform sensing or synchronization is formulated as a partially observed Markov decision process (POMDP) with continuous observations. Table-based Q-learning is used to search for an optimal policy on the belief space of the POMDP, but some form of approximation must be used because the belief space is a high -dimensional continuous space. 
We compare the performance of two approaches: 1) in replicated Q-learning, we learn a $Q$ function that is a linear function of the continuous beliefs; 2) in triple-Q learning, the beliefs are replaced by a model with fewer parameters, and quantization is used on a continuous parameter. Simulations are used to compare the performance of these methods with the approach that is most commonly used, which is using periodic synchronization at an optimized, fixed rate.\",\"PeriodicalId\":313199,\"journal\":{\"name\":\"GLOBECOM 2022 - 2022 IEEE Global Communications Conference\",\"volume\":\"45 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-12-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"GLOBECOM 2022 - 2022 IEEE Global Communications Conference\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/GLOBECOM48099.2022.10001565\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"GLOBECOM 2022 - 2022 IEEE Global Communications Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/GLOBECOM48099.2022.10001565","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

A network of distributed agents operating in a GPS-denied environment is tasked with tracking the position of a mobile asset. The agents use time-of-flight (ToF) measurements obtained from transmissions of a beacon signal by the asset to estimate the asset's position, but the estimates are noisy due to clock drift at the agents' local clocks. The clock drift can be reduced by having the asset and the network of agents perform a distributed synchronization process; however, synchronization and localization cannot be performed simultaneously, because the clock states are not well defined during synchronization and because of constraints on the radios' communication resources. The problem of optimizing when the agents should perform sensing or synchronization is formulated as a partially observable Markov decision process (POMDP) with continuous observations. Table-based Q-learning is used to search for an optimal policy on the belief space of the POMDP, but some form of approximation must be used because the belief space is a high-dimensional continuous space. We compare the performance of two approaches: 1) in replicated Q-learning, we learn a $Q$ function that is a linear function of the continuous beliefs; 2) in triple-Q learning, the beliefs are replaced by a model with fewer parameters, and quantization is used on a continuous parameter. Simulations compare the performance of these methods with the most commonly used approach, periodic synchronization at an optimized, fixed rate.
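
For readers unfamiliar with the first approach mentioned above, the minimal Python sketch below illustrates only the general idea behind a Q function that is linear in a continuous belief vector, updated by temporal-difference learning, as in replicated Q-learning. It is not the authors' implementation: the belief dimension, the two-action set (sense vs. synchronize), the learning rate, and the discount factor are assumptions made purely for illustration.

# Illustrative sketch: Q(b, a) = w[a] . b, one weight vector per action.
# All dimensions and hyperparameters below are assumed, not taken from the paper.
import numpy as np

N_BELIEF = 8      # assumed dimension of the belief vector
N_ACTIONS = 2     # assumed action set: 0 = sense (localize), 1 = synchronize
ALPHA = 0.05      # assumed learning rate
GAMMA = 0.95      # assumed discount factor

w = np.zeros((N_ACTIONS, N_BELIEF))  # linear Q weights

def q_values(belief):
    """Q(b, a) for all actions; linear in the belief vector."""
    return w @ belief

def update(belief, action, reward, next_belief):
    """One temporal-difference update of the linear Q weights."""
    td_target = reward + GAMMA * np.max(q_values(next_belief))
    td_error = td_target - q_values(belief)[action]
    w[action] += ALPHA * td_error * belief

def choose_action(belief, epsilon=0.1):
    """Epsilon-greedy selection between sensing and synchronizing."""
    if np.random.rand() < epsilon:
        return np.random.randint(N_ACTIONS)
    return int(np.argmax(q_values(belief)))

In this form, the continuous belief never has to be discretized; only the weight vectors are learned, which is what makes the approach tractable on a high-dimensional continuous belief space.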