Region-based High-resolution Siamese Network for Robust Visual Tracking

Chunbao Li, Bo Yang
Proceedings of the 4th International Conference on Biomedical Signal and Image Processing, 2019-08-13
DOI: https://doi.org/10.1145/3354031.3354051
Citations: 0

Abstract

Visual tracking is an active and challenging research topic in computer vision, because objects often undergo significant appearance variations caused by occlusion, deformation, and background clutter. In recent years, many trackers based on convolutional neural networks have achieved impressive performance by integrating multi-layer features. However, to perform multi-scale feature fusion, most of these trackers recover high-resolution representations from the low-resolution representations produced by a high-to-low resolution network, which tends to yield inaccurate feature maps or lose details of the target object. In this paper, we propose an end-to-end, region-based, high-resolution fully convolutional Siamese network for tracking. The tracker extracts the spatial and semantic information of the target object with a high-resolution network that maintains rich high-resolution representations of the target throughout the whole process. Furthermore, a set of position-sensitive score maps is obtained for all regions of the target template, and an adaptive weighting method is proposed to fuse the score maps of multiple regions. Experimental results on the OTB50 and OTB100 benchmark datasets demonstrate that our tracker performs better than several state-of-the-art trackers while running in real time.
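The region-wise score maps and their weighted fusion can be illustrated with a small sketch. The abstract does not say how the template is divided into regions or how the adaptive weights are computed, so everything below is an assumption made for illustration: the template is split into a fixed grid, plain sliding-window correlation on raw 2D grids stands in for the learned high-resolution features, and the weights are a softmax over each map's peak-minus-mean value. All function names are hypothetical and are not taken from the paper.

```python
import math


def correlate(search, patch):
    """Valid (no padding) sliding-window correlation of a 2D patch over a 2D grid."""
    sh, sw = len(search), len(search[0])
    ph, pw = len(patch), len(patch[0])
    return [
        [
            sum(search[y + i][x + j] * patch[i][j]
                for i in range(ph) for j in range(pw))
            for x in range(sw - pw + 1)
        ]
        for y in range(sh - ph + 1)
    ]


def region_score_maps(search, template, grid=2):
    """Assumed region scheme: split the template into grid x grid patches
    and return one score map per patch, aligned to the template origin."""
    th, tw = len(template), len(template[0])
    rh, rw = th // grid, tw // grid
    out_h, out_w = len(search) - th + 1, len(search[0]) - tw + 1
    maps = []
    for gy in range(grid):
        for gx in range(grid):
            oy, ox = gy * rh, gx * rw
            patch = [row[ox:ox + rw] for row in template[oy:oy + rh]]
            full = correlate(search, patch)
            # Shift by the region offset so that index (y, x) in every map
            # refers to the same candidate top-left corner of the template.
            maps.append([[full[y + oy][x + ox] for x in range(out_w)]
                         for y in range(out_h)])
    return maps


def adaptive_weights(maps):
    """Hypothetical weighting (not the paper's): softmax over peak-minus-mean,
    so regions with a sharper response receive a larger weight."""
    conf = []
    for m in maps:
        flat = [v for row in m for v in row]
        conf.append(max(flat) - sum(flat) / len(flat))
    top = max(conf)  # subtract the maximum for numerical stability
    exps = [math.exp(c - top) for c in conf]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(maps, weights):
    """Weighted sum of the aligned region score maps."""
    h, w = len(maps[0]), len(maps[0][0])
    return [[sum(wt * m[y][x] for wt, m in zip(weights, maps))
             for x in range(w)] for y in range(h)]


def locate(search, template, grid=2):
    """Return the (row, col) of the fused score map's maximum."""
    maps = region_score_maps(search, template, grid)
    fused = fuse(maps, adaptive_weights(maps))
    positions = [(y, x) for y in range(len(fused)) for x in range(len(fused[0]))]
    return max(positions, key=lambda p: fused[p[0]][p[1]])
```

Calling `locate(search, template)` with two nested lists of numbers returns the top-left position where the fused response is highest. Because the weights are recomputed from the current score maps on every call, a region whose response is flat (for example, one that is occluded) contributes less to the fused map under this assumed scheme.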