Triple Loss based Satellite Image Localisation for Aerial Platforms

24th Irish Machine Vision and Image Processing Conference. Pub Date: 2022-08-31. DOI: 10.56541/pjfn5642

Eduardo Andres Avila Herrera, Tim McCarhy, J. McDonald


Citations: 0

Abstract



We present a vision-based technique for aerial platform localisation using satellite imagery. Our approach applies a modified VGG16 network in conjunction with a triplet loss to encode aerial views as discriminative scene embeddings. The platform is localised by comparing the encoding of its current view with a database of pre-encoded embeddings using a cosine similarity metric. Recent image-based localisation research has shown the potential of such learned embeddings; however, to ensure reliable matching they require dense sampling of views of the environment, which limits their operational area. In contrast, the combination of our proposed architecture with the triplet loss shows robustness over greater spatial shifts, reducing the need for dense sampling. We demonstrate these improvements through comparison with a state-of-the-art approach using simulated ground truth sequences derived from a real-world satellite dataset covering a 1.5 km × 1 km region in Karlsruhe.
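The matching step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy 3-D vectors stand in for the outputs of the modified VGG16 encoder, the function names (`cosine_similarity`, `triplet_loss`, `localise`) and the margin of 0.2 are hypothetical, and measuring the triplet loss on cosine distance is an assumption, since the abstract only states that cosine similarity is used for matching.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on cosine distance (1 - similarity).

    Assumption: both the distance choice and the margin are illustrative
    and are not taken from the paper.
    """
    d_pos = 1.0 - cosine_similarity(anchor, positive)
    d_neg = 1.0 - cosine_similarity(anchor, negative)
    # Zero loss once the negative is at least `margin` farther than the positive.
    return max(0.0, d_pos - d_neg + margin)


def localise(query, database):
    """Return the position whose stored embedding is most similar to the query.

    `database` is a list of (position, embedding) pairs.
    """
    best_position, _ = max(
        database, key=lambda item: cosine_similarity(query, item[1])
    )
    return best_position


# Hypothetical pre-encoded embeddings keyed by map position in metres.
database = [
    ((0.0, 0.0), [1.0, 0.0, 0.0]),
    ((100.0, 0.0), [0.0, 1.0, 0.0]),
    ((0.0, 100.0), [0.0, 0.0, 1.0]),
]

# Hypothetical embedding of the platform's current view.
query = [0.1, 0.9, 0.1]
estimated_position = localise(query, database)
```

In this sketch the loss is zero only when the positive view is closer to the anchor than the negative view by at least the margin. Training on such triplets pushes embeddings of nearby views together, which is the property the abstract relies on to reduce the need for dense sampling.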

