Dynamic Multi-Loss Weighting for Multiple People Tracking in Video Surveillance Systems

Xuan-Thuy Vo, T. Tran, Duy-Linh Nguyen, K. Jo
2021 IEEE 19th International Conference on Industrial Informatics (INDIN), published 2021-07-21
DOI: 10.1109/INDIN45523.2021.9557515 (https://doi.org/10.1109/INDIN45523.2021.9557515)
Citations: 2

Abstract

Multiple people tracking is a fundamental yet challenging task in computer vision and serves as a primary step for high-level tasks such as human behavior analysis, action recognition, and pose estimation. Person tracking is decomposed into detection and re-identification (re-ID) sub-tasks. Conventionally, detection learns classification and regression objectives simultaneously, and the re-ID sub-task is treated as a classification task. Person tracking is therefore a multi-task learning problem with multiple loss functions (multiple objectives): one bounding-box regression and two classifications. The tasks differ in three ways: the ranges of the objectives are inconsistent, each task contributes differently to the overall gradient, and each task learns at a different pace (level of difficulty). These differences lead to an objective imbalance in multi-task learning. Previous methods introduced weighting factors as new hyper-parameters to balance the ranges of the tasks. The search space for manually tuning these hyper-parameters is high-dimensional, and its dimension grows with the number of tasks, so selecting reasonable weighting factors is difficult. This paper introduces dynamic multi-loss weighting (DMW), a simple but effective method in which the weighting factors change dynamically during training without introducing any hyper-parameters. The dynamic weights are optimized to balance the regression and classification objectives and depend on the difficulty level of each task and the correlation between tasks. Additionally, general convolution operations are spatially invariant to some degree, which hinders the network's performance. Hence, this work employs a position-sensitive operation to improve feature extraction. The proposed method is evaluated on the challenging MOT17 benchmark, where it outperforms online multiple people trackers without using additional data.
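The abstract does not give the DMW formula, so the following is only a minimal sketch of the general idea of balancing loss ranges without hand-tuned weights. It assumes a simple inverse-magnitude heuristic, which is not the paper's method: each task weight is set inversely proportional to the current loss value, then rescaled so the weights sum to the number of tasks. The function name `balance_weights`, the guard value `eps`, and the three example loss values are hypothetical.

```python
def balance_weights(losses, eps=1e-12):
    """Return one weight per task so that every weighted loss has the same value.

    Assumption: a generic inverse-magnitude heuristic, not the DMW rule
    from the paper. `eps` only guards against division by zero.
    """
    inverse = [1.0 / max(loss, eps) for loss in losses]
    # Rescale so the weights sum to the number of tasks.
    scale = len(losses) / sum(inverse)
    return [value * scale for value in inverse]


def total_loss(losses):
    """Combine per-task losses using weights recomputed from the current values."""
    weights = balance_weights(losses)
    return sum(w * loss for w, loss in zip(weights, losses)), weights


# Hypothetical loss values for one training step:
# detection classification, bounding-box regression, re-ID classification.
step_losses = [0.5, 2.0, 8.0]
combined, weights = total_loss(step_losses)
print("weights:", [round(w, 4) for w in weights])
print("combined loss:", round(combined, 4))
```

With fixed weights, the re-ID term in this example would dominate the sum because its range is the largest; here every weighted term ends up equal. In a real training loop the weights would typically be treated as constants during backpropagation, and a scheme that also accounts for task difficulty and task correlation, as the abstract describes, would need more than this heuristic.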