Lin Wang, Jie Li, Chun Qi, Xuan Wu, Runrun Zou, Fengping Wang, Pan Wang
Image and Vision Computing, Volume 159, Article 105578. Published 2025-05-03. Journal Article. DOI: 10.1016/j.imavis.2025.105578. Impact factor: 4.2 (JCR Q2, Computer Science, Artificial Intelligence). URL: https://www.sciencedirect.com/science/article/pii/S0262885625001660
A neighbor-aware feature enhancement network for crowd counting
Deep neural networks have achieved significant progress in the field of crowd counting in recent years. However, many networks still face challenges in effectively representing crowd features due to the insufficient exploitation of inter-channel and inter-pixel relationships. To overcome these limitations, we propose the Neighbor-Aware Feature Enhancement Network (NAFENet), a novel architecture designed to strengthen feature representation by adequately leveraging both channel and pixel dependencies. Specifically, we introduce two modules to model channel dependencies: the Across Channel Attention Module (ACAM) and the Channel Residual Module (CRM). ACAM computes a relevance map to quantify the influence of adjacent channels on the current channel and extracts valuable information to enrich the feature representation. On the other hand, CRM learns the residual maps between adjacent channels to capture their correlations and differences, enabling the network to gain a deeper understanding of the image content. In addition, we embed a Spatial Correlation Module (SCM) in NAFENet to model long-range dependencies between pixels across neighboring rows to analyze long continuous structures more effectively. Experimental results on six challenging datasets demonstrate that the proposed method achieves impressive performance compared to state-of-the-art models. Complexity analysis further reveals that our model is more efficient, requiring less time and fewer computational resources than other approaches.
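The abstract describes channel dependencies only at a high level and gives no formulas, so the following is a toy illustration of the two ideas it names, not the NAFENet modules themselves. It assumes a residual map is the element-wise difference between a channel and its next neighbor, and it uses cosine similarity as a stand-in relevance score between adjacent channels. The function names `adjacent_channel_residuals` and `adjacent_channel_relevance` are hypothetical, and nothing here is learned.

```python
import math


def adjacent_channel_residuals(features):
    """Return C-1 residual maps, where residual[c] = features[c+1] - features[c].

    `features` is a list of C channels, each a list of H rows of W floats.
    Assumption: a plain element-wise difference stands in for a learned residual.
    """
    residuals = []
    for current, neighbor in zip(features, features[1:]):
        residuals.append([
            [n - c for c, n in zip(row_cur, row_nbr)]
            for row_cur, row_nbr in zip(current, neighbor)
        ])
    return residuals


def adjacent_channel_relevance(features):
    """Return C-1 cosine similarities between each channel and its next neighbor.

    Assumption: cosine similarity is a stand-in for the relevance map in the abstract.
    """
    def flatten(channel):
        return [value for row in channel for value in row]

    scores = []
    for current, neighbor in zip(features, features[1:]):
        a, b = flatten(current), flatten(neighbor)
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        # Guard against all-zero channels, which have no defined direction.
        scores.append(dot / norm if norm else 0.0)
    return scores


# Three 2x2 channels: the second is a scaled copy of the first,
# the third is active at entirely different pixels.
features = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, 0.0], [0.0, 2.0]],
    [[0.0, 3.0], [3.0, 0.0]],
]
residuals = adjacent_channel_residuals(features)
relevance = adjacent_channel_relevance(features)
```

In this example the first pair of channels differs only in scale, so its relevance score is close to 1 and its residual map repeats the same pattern, while the second pair shares no active pixels, so its relevance score is 0 and its residual map carries the full difference.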
About the journal:
Image and Vision Computing has as a primary aim the provision of an effective medium of interchange for the results of high quality theoretical and applied research fundamental to all aspects of image interpretation and computer vision. The journal publishes work that proposes new image interpretation and computer vision methodology or addresses the application of such methods to real world scenes. It seeks to strengthen a deeper understanding in the discipline by encouraging the quantitative comparison and performance evaluation of the proposed methodology. The coverage includes: image interpretation, scene modelling, object recognition and tracking, shape analysis, monitoring and surveillance, active vision and robotic systems, SLAM, biologically-inspired computer vision, motion analysis, stereo vision, document image understanding, character and handwritten text recognition, face and gesture recognition, biometrics, vision-based human-computer interaction, human activity and behavior understanding, data fusion from multiple sensor inputs, image databases.