Yang Li, Guanci Yang, Zhidong Su, Shaobo Li, Jing Yang, Ling He
Pattern Recognition, Volume 165, Article 111623. Published 2025-03-22. Impact factor 7.5 (JCR Q1, Computer Science, Artificial Intelligence).
DOI: 10.1016/j.patcog.2025.111623
https://www.sciencedirect.com/science/article/pii/S0031320325002833
Indoor scene multi-object tracking based on region search and memory buffer pool
This study proposes a new task, Indoor Scene Multi-Object Tracking (IS-MOT), which aims at multi-granularity parsing and continuous tracking of people in indoor scenes. To support the IS-MOT task, we draw on the basic composition of human movement and the characteristics of indoor human motion to construct a large-scale multi-object tracking benchmark from the perspective of an indoor social robot, termed Multi-Resident Tracking (MRT). To address the insufficient persistent tracking capability of existing MOT methods when they are extended to the IS-MOT task, we design a persistent visual multi-object tracking method based on region search and a memory buffer pool (PeViTrack). PeViTrack is mainly composed of a Homogeneous Semantic Memory Buffer Pool (HSMBP) that integrates a Motion State Estimation Module (MSEM) and a Hierarchical Matching Correlation Mechanism (HMCM). HSMBP allows the network to construct an allocation representation based on high- and low-confidence detection boxes, thereby establishing homogeneous and heterogeneous semantic embedding decision spaces in the spatial domain, which forces the network to search for and accurately associate homogeneous and heterogeneous object features efficiently. Extensive experiments on the constructed MRT benchmark and the well-recognized DanceTrack dataset show that PeViTrack achieves state-of-the-art tracking performance. The code and datasets will be made available at https://github.com/funweb/PeViTrack.
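The abstract does not specify how the high- and low-confidence detection boxes are allocated or matched, so the following is only a minimal sketch of the general idea of a confidence-split, two-stage association. It is not PeViTrack's HSMBP or HMCM. Every name, the greedy IoU matching, and the thresholds (`high_thresh`, `min_iou`) are assumptions made for illustration: tracks are first matched against high-confidence boxes, and only the tracks left over are then matched against low-confidence boxes.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_match(
    tracks: Dict[int, Box],
    dets: List[Tuple[int, Box]],
    min_iou: float,
) -> Dict[int, int]:
    """Pair track ids with detection indices, highest IoU first.

    Hypothetical helper: a real tracker would typically use an optimal
    assignment and appearance features rather than greedy IoU alone.
    """
    pairs = sorted(
        ((iou(tb, db), tid, di) for tid, tb in tracks.items() for di, db in dets),
        reverse=True,
    )
    matched: Dict[int, int] = {}
    used = set()
    for score, tid, di in pairs:
        if score < min_iou:
            break  # all remaining pairs overlap even less
        if tid in matched or di in used:
            continue
        matched[tid] = di
        used.add(di)
    return matched


def two_stage_associate(
    tracks: Dict[int, Box],
    detections: List[Tuple[Box, float]],
    high_thresh: float = 0.6,  # assumed value, not from the paper
    min_iou: float = 0.3,      # assumed value, not from the paper
) -> Dict[int, int]:
    """Return {track_id: detection_index} using a confidence-split cascade."""
    high = [(i, box) for i, (box, s) in enumerate(detections) if s >= high_thresh]
    low = [(i, box) for i, (box, s) in enumerate(detections) if s < high_thresh]

    # Stage 1: match all tracks against high-confidence boxes.
    first = greedy_match(tracks, high, min_iou)

    # Stage 2: give still-unmatched tracks a chance on low-confidence boxes.
    remaining = {tid: box for tid, box in tracks.items() if tid not in first}
    second = greedy_match(remaining, low, min_iou)

    return {**first, **second}
```

With two tracks at `(0, 0, 10, 10)` and `(20, 20, 30, 30)`, a high-confidence box near the first and a low-confidence box near the second, the first track is matched in stage 1 and the second is recovered in stage 2. Recovering tracks from low-confidence boxes in this way is what makes such a cascade useful when a person is partly occluded.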
About the journal:
The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.