Feature Fusion with Deep Neural Network in Kernelized Correlation Filters Tracker

D. Maharani, C. Machbub, P. Rusmin, L. Yulianti
Published in: 2021 IEEE 11th International Conference on System Engineering and Technology (ICSET)
Publication date: 2021-11-06
DOI: 10.1109/ICSET53708.2021.9612567 (https://doi.org/10.1109/ICSET53708.2021.9612567)
Citations: 1

Abstract

Moving object tracking is a key component of many computer vision applications. Humans see and track moving objects by attending to notable features such as color, shape, and function, and computer vision now approaches this ability. A computer can track moving objects by computing features such as the Histogram of Oriented Gradients (HOG) and grayscale intensity, which serve as input to the tracking algorithm. Correlation filter algorithms are widely used in object tracking because of their accuracy and speed, and Kernelized Correlation Filters (KCF) is one such correlation-based method. Feature fusion is widely used to make tracking more robust. In this paper, both HOG and grayscale features are implemented in the KCF method, and Deep Neural Network (DNN) regression is used to make the feature fusion decision. Following a principle similar to Non-Maximum Suppression (NMS), when the HOG and grayscale features yield two overlapping candidates, the region of interest (ROI) is pruned by replacing one ROI to produce a more accurate object candidate. Three TB dataset videos were used for testing, and two videos were used for training. The DNN regression architecture uses six hidden layers with 512, 256, 64, 32, 16, and 8 nodes. Training accuracy was 95.76%, with an MSE of 9.94 and a loss of 9.93. The results show that the system can track objects more precisely and robustly, with an RMSE of 9.38 while achieving 32 FPS.
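The quantities named in the abstract can be made concrete with a small sketch. The hidden-layer widths are taken from the abstract; everything else is an assumption for illustration only: the abstract does not state the regressor's input or output sizes, the box format, the rule for choosing which ROI to replace, or how RMSE is computed. The sketch assumes an 8-value input (two `(x, y, w, h)` candidates), a 4-value output, an overlap-based selection rule, and RMSE over box coordinates. The function names are hypothetical and are not the authors' code.

```python
import math

# Hidden-layer widths as stated in the abstract.
HIDDEN_WIDTHS = [512, 256, 64, 32, 16, 8]


def mlp_parameter_count(n_inputs, n_outputs, hidden=HIDDEN_WIDTHS):
    """Count weights and biases of a fully connected regressor.

    n_inputs and n_outputs are assumptions; the abstract gives only
    the hidden-layer widths.
    """
    sizes = [n_inputs, *hidden, n_outputs]
    return sum(a * b + b for a, b in zip(sizes, sizes[1:]))


def iou(box_a, box_b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def keep_closer_candidate(hog_box, gray_box, predicted_box):
    """Hypothetical selection rule: keep the candidate that overlaps
    more with the regressor's predicted box (ties go to the HOG box)."""
    if iou(hog_box, predicted_box) >= iou(gray_box, predicted_box):
        return hog_box
    return gray_box


def rmse(predicted_boxes, reference_boxes):
    """Root-mean-square error over all box coordinates (assumed metric)."""
    squared = [
        (p - r) ** 2
        for p_box, r_box in zip(predicted_boxes, reference_boxes)
        for p, r in zip(p_box, r_box)
    ]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    hog_candidate = (0, 0, 10, 10)
    gray_candidate = (5, 0, 10, 10)
    predicted = (1, 0, 10, 10)  # stand-in for a regressor output
    print(mlp_parameter_count(8, 4))
    print(iou(hog_candidate, gray_candidate))
    print(keep_closer_candidate(hog_candidate, gray_candidate, predicted))
    print(rmse([hog_candidate], [(3, 4, 10, 10)]))
```

Under these assumed input and output sizes the network is small, which is at least consistent with the real-time frame rate reported in the abstract, though the abstract itself does not attribute the speed to any single design choice.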