Wenjuan Shi, Xiangwei Zheng, Lifeng Zhang, Cun Ji, Yuang Zhang, Ji Bian
Signal Processing, Volume 236, Article 110058. Published 2025-04-24. DOI: 10.1016/j.sigpro.2025.110058. Impact factor: 3.4 (JCR Q2, Engineering, Electrical & Electronic). Source: https://www.sciencedirect.com/science/article/pii/S0165168425001720
Multi-Object Tracking based on Optimal Transport and Coordinate Attention Mechanism
Multi-Object Tracking (MOT) has attracted significant interest because of its wide range of applications, such as autonomous driving, intelligent surveillance, and behavior recognition. However, when different objects look alike, target matching becomes inaccurate and data association becomes difficult. In this paper, we propose a multi-object tracking method based on Optimal Transport and Coordinate Attention Mechanism (MOT2A), which addresses these challenges by integrating an attention mechanism with optimal transport. Together, these strategies enhance the extraction of discriminative appearance features and improve target matching across frames. First, we construct a novel coordinate attention module (CASA), which models the interdependence between the channel domain and the spatial domain of the feature map. Second, we design a triplet loss with optimal transport (SK-Triplet), which adjusts the distance matrix so that positive and negative samples are clustered effectively during loss calculation. Finally, we conduct extensive experiments on MOT17 and MOT20. MOT2A achieves 79.4 MOTA, 78.9 IDF1, and 63.9 HOTA on MOT17, and 77.0 MOTA, 76.3 IDF1, and 62.3 HOTA on MOT20. Compared with existing MOT methods, our method shows significant improvements in accuracy and stability. The code is available at: https://github.com/420-s/MOT2A.
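The abstract does not say how SK-Triplet adjusts the distance matrix, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that "SK" refers to Sinkhorn-Knopp iterations for entropic optimal transport, that the marginals are uniform, and that the adjusted distance is `-eps * log(plan)`. The function names, the batch-hard mining rule, and the `margin` and `eps` values are all hypothetical choices made for illustration.

```python
import math


def pairwise_dist(feats):
    """Euclidean distance between every pair of embeddings."""
    n = len(feats)
    return [[math.dist(feats[i], feats[j]) for j in range(n)] for i in range(n)]


def sinkhorn_plan(cost, eps=0.5, iters=50):
    """Entropic optimal transport plan with uniform marginals (Sinkhorn-Knopp).

    Alternately rescales rows and columns of exp(-cost / eps) so that each
    row and each column of the returned plan sums to roughly 1 / n.
    """
    n = len(cost)
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]
    u, v = [1.0] * n, [1.0] * n
    target = 1.0 / n
    for _ in range(iters):
        u = [target / sum(kernel[i][j] * v[j] for j in range(n)) for i in range(n)]
        v = [target / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(n)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(n)] for i in range(n)]


def sk_triplet_loss(feats, labels, margin=0.3, eps=0.5):
    """Batch-hard triplet loss on OT-adjusted distances (hypothetical variant).

    Assumption: the adjusted distance is -eps * log(plan), which equals the
    raw distance shifted by the Sinkhorn dual potentials of each sample.
    Very large raw distances could underflow the plan to 0; this sketch
    does not guard against that case.
    """
    plan = sinkhorn_plan(pairwise_dist(feats), eps)
    adjusted = [[-eps * math.log(p) for p in row] for row in plan]
    n = len(feats)
    losses = []
    for a in range(n):
        pos = [adjusted[a][j] for j in range(n) if j != a and labels[j] == labels[a]]
        neg = [adjusted[a][j] for j in range(n) if labels[j] != labels[a]]
        if pos and neg:  # skip anchors lacking a positive or a negative
            losses.append(max(0.0, margin + max(pos) - min(neg)))
    return sum(losses) / len(losses) if losses else 0.0


# Toy batch: two tight clusters of 2-D appearance embeddings.
feats = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
print(sk_triplet_loss(feats, [0, 0, 1, 1]))  # identities match the clusters
print(sk_triplet_loss(feats, [0, 1, 0, 1]))  # identities cut across the clusters
```

In this toy batch the loss should vanish when identity labels agree with the clusters and be positive when they cut across them, which is the behavior a triplet loss is meant to penalize. A real tracker would compute this on learned appearance embeddings inside an autodiff framework rather than on plain lists.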
About the journal:
Signal Processing incorporates all aspects of the theory and practice of signal processing. It features original research work, tutorial and review articles, and accounts of practical developments. It is intended for the rapid dissemination of knowledge and experience to engineers and scientists working in the research, development or practical application of signal processing.
Subject areas covered by the journal include: Signal Theory; Stochastic Processes; Detection and Estimation; Spectral Analysis; Filtering; Signal Processing Systems; Software Developments; Image Processing; Pattern Recognition; Optical Signal Processing; Digital Signal Processing; Multi-dimensional Signal Processing; Communication Signal Processing; Biomedical Signal Processing; Geophysical and Astrophysical Signal Processing; Earth Resources Signal Processing; Acoustic and Vibration Signal Processing; Data Processing; Remote Sensing; Signal Processing Technology; Radar Signal Processing; Sonar Signal Processing; Industrial Applications; New Applications.