Fok Hing Chi Tivive;Abdesselam Bouzerdoum;Son Lam Phung;Hoang Thanh Le;Hamza Baali
IEEE Transactions on Artificial Intelligence, vol. 5, no. 11, pp. 5733-5745. Journal article, published 2024-08-14.
DOI: 10.1109/TAI.2024.3443789
https://ieeexplore.ieee.org/document/10636204/
A Deep Learning-Based Method for Crowd Counting Using Shunting Inhibition Mechanism
Image-based crowd counting has gained significant attention due to its widespread applications in security and surveillance. Recent advances in deep learning have produced numerous methods that count crowds with remarkable accuracy. However, many existing deep learning methods have large model sizes, which makes them unsuitable for deployment on edge devices. This article introduces a novel network architecture and processing element designed to create an efficient and compact deep learning model for crowd counting. The processing element, referred to as the shunting inhibitory neuron, generates complex decision boundaries, making it more powerful than the traditional perceptron. It is employed for feature extraction in both the encoder and decoder modules of the proposed model. Furthermore, the decoder includes alternating convolutional and transformer layers, which provide local receptive fields and global self-attention, respectively. This design captures rich contextual information that is used to generate accurate segmentation and density maps. The self-attention mechanism is implemented using convolution modulation instead of matrix multiplication to reduce computational cost. Experiments conducted on three challenging crowd counting datasets demonstrate that the proposed network, despite its small model size, achieves crowd counting performance comparable to that of state-of-the-art techniques. Code is available at https://github.com/ftivive/SINet.
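The claim that a shunting inhibitory neuron forms richer decision boundaries than a perceptron can be illustrated with a minimal sketch. It assumes the generic form from the shunting inhibition literature, z = g(w·x + b) / (a + f(c·x + d)), with g = tanh and f = exp. The function name, activation choices, and hand-picked parameter values below are illustrative assumptions and are not taken from the article or the SINet repository.

```python
import numpy as np

def shunting_neuron(x, w, b, c, d, a=1.0):
    """Hypothetical single shunting inhibitory neuron (assumed form).

    z = g(w.x + b) / (a + f(c.x + d)), with g = tanh and f = exp.
    Because exp(.) > 0 and a > 0, the denominator is always positive.
    """
    excitation = np.tanh(x @ w + b)   # excitatory (numerator) path
    inhibition = np.exp(x @ c + d)    # inhibitory (denominator) path
    return excitation / (a + inhibition)

# The four XOR input patterns.
X = np.array([[0.0, 0.0],
              [0.0, 1.0],
              [1.0, 0.0],
              [1.0, 1.0]])

# Hand-picked (not learned) parameters, chosen only for illustration.
w = np.array([1.0, 1.0])
b = 0.0
c = np.array([3.0, 3.0])
d = -3.0

z = shunting_neuron(X, w, b, c, d)
pred = z > 0.2   # threshold chosen by hand for this toy example
print(np.round(z, 3), pred)
```

With these values the response is zero when both inputs are off, rises when exactly one input is on, and is suppressed by the inhibitory term when both are on. Thresholding the output therefore reproduces XOR, which a single perceptron with a linear decision boundary cannot do.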