A 1096fps Hardware Architecture for Fast Training in Object Tracking
Yun Lv, Huiyu Mo, Leibo Liu, S. Yin, Shaojun Wei, Wenping Zhu, Qiang Li
2019 IEEE 11th International Conference on Communication Software and Networks (ICCSN), published 2019-06-01
DOI: 10.1109/ICCSN.2019.8905251 (https://doi.org/10.1109/ICCSN.2019.8905251)
Abstract
In recent years, Discriminative Correlation Filter (DCF) based methods have significantly advanced the state of the art in tracking accuracy. However, their high-complexity training process makes it hard for a tracker to sustain both high accuracy and high speed. In this work, the training algorithm is optimized to significantly reduce its computation with an acceptable loss of accuracy, and dedicated hardware is then designed to further accelerate training while maintaining high accuracy. First, time constraints are relaxed so that a serial module can be turned into a parallel one. Second, the symmetry and sparsity of the regularization filter kernel are exploited to eliminate 80% of the computation in the regularization convolution. Third, the computation of the inner product module in training is reduced by splitting complex-number calculations into separate real-part and imaginary-part calculations. Overall, about 24.19% of the training computation is removed and 4.30% of the parallel processing time is saved, yielding a 1.32x improvement in hardware resources and a 1.05x speedup over the original process. Simulation results show that the hardware achieves a throughput of 1096 fps at 250 MHz, which makes it especially suitable for tracking tasks that require both high speed and high accuracy.
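
The second and third optimizations can be illustrated with a minimal software sketch. It is not the authors' hardware design, and the abstract does not give the kernel shape or the exact inner product formulation, so everything here is an assumption: a 1-D circular convolution stands in for the regularization convolution, the kernel is assumed to satisfy w[-k] = w[k], and the function names (conv_dense, conv_symmetric_sparse, inner_real, inner_imag) are hypothetical. The idea shown is that a symmetric kernel lets two input samples share one multiplication, zero taps can be skipped entirely, and a complex inner product can be evaluated as two independent real-valued accumulations.

```python
def conv_dense(x, kernel):
    """Reference circular convolution; kernel lists the taps for offsets -r..r.

    Returns (output, number_of_multiplications).
    """
    n = len(x)
    r = len(kernel) // 2
    y, mults = [], 0
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            k = j - r                      # tap offset in -r..r
            acc += w * x[(i - k) % n]
            mults += 1
        y.append(acc)
    return y, mults


def conv_symmetric_sparse(x, half_kernel):
    """Same convolution, assuming w[-k] == w[k].

    half_kernel maps offset k >= 0 to a NONZERO weight (assumed storage format).
    Mirrored samples are added first, so each stored tap costs one multiply.
    """
    n = len(x)
    y, mults = [], 0
    for i in range(n):
        acc = 0.0
        for k, w in half_kernel.items():
            if k == 0:
                acc += w * x[i]
            else:
                acc += w * (x[(i - k) % n] + x[(i + k) % n])
            mults += 1
        y.append(acc)
    return y, mults


def inner_real(a, b):
    """Real part of sum(conj(a_i) * b_i), using real arithmetic only."""
    return sum(p.real * q.real + p.imag * q.imag for p, q in zip(a, b))


def inner_imag(a, b):
    """Imaginary part of sum(conj(a_i) * b_i), using real arithmetic only."""
    return sum(p.real * q.imag - p.imag * q.real for p, q in zip(a, b))


# Small demonstration with made-up data (not taken from the paper).
signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
dense_out, dense_mults = conv_dense(signal, [0.5, 0.0, 2.0, 0.0, 0.5])
fast_out, fast_mults = conv_symmetric_sparse(signal, {0: 2.0, 2: 0.5})
print("multiplications: dense =", dense_mults, ", symmetric+sparse =", fast_mults)
```

The multiplication savings in this toy case depend entirely on the made-up kernel and do not reproduce the 80% figure reported above. Splitting the inner product into inner_real and inner_imag mirrors the idea that the two parts can be handled by separate real-valued datapaths, and that a part which is not needed does not have to be computed at all.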