HVC-Net: Unifying Homography, Visibility, and Confidence Learning for Planar Object Tracking

Haoxian Zhang, Yonggen Ling
DOI: 10.48550/arXiv.2209.08924 (https://doi.org/10.48550/arXiv.2209.08924)
Published in: Computer vision - ECCV ... : ... European Conference on Computer Vision : proceedings. European Conference on Computer Vision, volume 26 1, pages 701-718
Publication date: 2022-09-19
Citations: 3

Abstract

Robust and accurate planar tracking over a whole video sequence is vitally important for many vision applications. The key to planar object tracking is to find object correspondences, modeled by a homography, between the reference image and the tracked image. Existing methods tend to produce wrong correspondences under appearance variations, camera-object relative motion, and occlusions. To alleviate this problem, we present a unified convolutional neural network (CNN) model that jointly considers homography, visibility, and confidence. First, we introduce correlation blocks that explicitly account for local appearance changes and camera-object relative motion as the base of our model. Second, we jointly learn homography and visibility, which links camera-object relative motion with occlusions. Third, we propose a confidence module that actively monitors the estimation quality using the pixel correlation distributions obtained in the correlation blocks. All these modules are plugged into a Lucas-Kanade (LK) tracking pipeline to obtain both accurate and robust planar object tracking. Our approach outperforms the state-of-the-art methods on the public POT and TMT datasets. Its superior performance is also verified on a real-world application: synthesizing high-quality in-video advertisements.
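To make concrete what "correspondences modeled by homography" means, the sketch below maps the corners of a planar object from the reference image into the tracked image with a 3x3 homography. It is only an illustration of the geometric model, not the HVC-Net network or its LK pipeline; the function name `apply_homography`, the matrix `H`, and the corner coordinates are hypothetical values chosen for the example.

```python
import numpy as np


def apply_homography(H, points):
    """Map N x 2 pixel coordinates through a 3 x 3 homography H.

    Hypothetical helper for illustration; not taken from the paper.
    """
    points = np.asarray(points, dtype=float)
    ones = np.ones((points.shape[0], 1))
    homogeneous = np.hstack([points, ones])   # (N, 3) homogeneous coordinates
    mapped = homogeneous @ np.asarray(H, dtype=float).T  # apply H to each point
    # Divide by the third coordinate to return to pixel coordinates.
    return mapped[:, :2] / mapped[:, 2:3]


# Assumed example: a homography that is a pure translation by (10, 5) pixels.
H = np.array([
    [1.0, 0.0, 10.0],
    [0.0, 1.0, 5.0],
    [0.0, 0.0, 1.0],
])

# Assumed corners of the planar object in the reference image.
reference_corners = np.array([[0, 0], [100, 0], [100, 50], [0, 50]])

tracked_corners = apply_homography(H, reference_corners)
```

A homography is defined only up to scale, so multiplying `H` by any nonzero constant gives the same mapped points; the division by the third coordinate is what removes that scale.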