TransMambaCC: Integrating Transformer and Pyramid Mamba Network for RGB-T Crowd Counting
Authors: Yangjian Chen, Huailin Zhao, Liangjun Huang, Yubo Yang, Wencan Kang, Jianwei Zhang
Journal: Applied Intelligence, 55(15), published 2025-09-29
Impact factor: 3.5 (JCR Q2, Computer Science, Artificial Intelligence)
DOI: 10.1007/s10489-025-06912-5
URL: https://link.springer.com/article/10.1007/s10489-025-06912-5
RGB-T crowd counting is a challenging task that integrates RGB and thermal images to address the limitations of RGB-only approaches in scenes with poor illumination or occlusion. While transformer-based models have shown remarkable success in capturing long-range dependencies, their high computational demands limit their practical applicability. To address this issue, a novel hybrid model named TransMambaCC is proposed, which fuses the analytical strength of the transformer with the computational efficiency of Mamba. This integration not only improves crowd analysis performance but also significantly reduces the model's computational overhead. Additionally, a Pyramid Mamba module is designed to address the head-scale variations observed in congested scenes. Extensive experiments on the RGBT-CC dataset demonstrate the superiority of TransMambaCC over existing approaches in both accuracy and efficiency. Furthermore, the model exhibits strong generalization capabilities, as evidenced by its performance on the ShanghaiTechRGBD dataset. The code is available at https://github.com/yjchen3250/TransMambaCC.
About the journal:
With a focus on research in artificial intelligence and neural networks, this journal addresses solutions to real-life manufacturing, defense, management, government, and industrial problems that are too complex to be solved through conventional approaches and that require the simulation of intelligent thought processes, heuristics, applications of knowledge, and distributed and parallel processing. The integration of these multiple approaches in solving complex problems is of particular importance.
The journal presents new and original research and technological developments, addressing real and complex issues applicable to difficult problems. It provides a medium for exchanging scientific research and technological achievements accomplished by the international community.