I-MMCCN: Improved MMCCN for RGB-T Crowd Counting of Drone Images
Binyu Zhang, Yunhao Du, Yanyun Zhao, Jun-Jun Wan, Zhihang Tong
2021 7th IEEE International Conference on Network Intelligence and Digital Content (IC-NIDC), published 2021-11-17
DOI: https://doi.org/10.1109/IC-NIDC54101.2021.9660586
Citations: 3
Abstract
Crowd counting is a critical technique in many artificial intelligence applications, such as security monitoring and automatic transportation management. However, variations in object scale, illumination, and image quality make crowd counting from drone images challenging. To fully exploit the information in the multi-modal RGB-T images captured by drones, we propose a hard example mining module and a novel Block Mean Absolute Error (BMAE) loss to improve the Multi-Modal Crowd Counting Network (MMCCN). With the local structural supervision introduced by the BMAE loss, the network can incorporate the local spatial correlation within each block and focus on the local pattern of people. In addition, the BMAE loss more closely resembles the evaluation metrics. Combining the proposed hard example mining module and the BMAE loss with MMCCN yields our Improved MMCCN, named I-MMCCN. Experiments on the DroneRGBT dataset verify the effectiveness of I-MMCCN: on the DroneRGBT validation set, its MAE is 1.01 lower and its RMSE is 1.48 lower than those of MMCCN.
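The abstract does not give the formula for the BMAE loss, so the following is only a minimal sketch of one plausible reading: both density maps are split into non-overlapping square blocks, the density in each block is summed to a per-block count, and the absolute differences of those counts are averaged. The names `block_sums`, `block_mae`, and `count_abs_error`, the non-overlapping tiling, and the plain averaging are all assumptions made for illustration and may differ from the authors' definition.

```python
from typing import List, Sequence

Grid = Sequence[Sequence[float]]


def block_sums(density: Grid, block: int) -> List[float]:
    """Sum a 2-D density map over non-overlapping block x block tiles."""
    height, width = len(density), len(density[0])
    if height % block or width % block:
        raise ValueError("map size must be a multiple of the block size")
    sums = []
    for top in range(0, height, block):
        for left in range(0, width, block):
            sums.append(sum(
                density[r][c]
                for r in range(top, top + block)
                for c in range(left, left + block)
            ))
    return sums


def block_mae(pred: Grid, target: Grid, block: int) -> float:
    """Assumed form of a block-wise MAE: mean |predicted - true| block count."""
    p = block_sums(pred, block)
    t = block_sums(target, block)
    return sum(abs(a - b) for a, b in zip(p, t)) / len(p)


def count_abs_error(pred: Grid, target: Grid) -> float:
    """Absolute error of the whole-image count (the per-image term of MAE)."""
    return abs(sum(map(sum, pred)) - sum(map(sum, target)))


# Toy 4x4 ground truth with two people (hypothetical data).
target = [[1, 0, 0, 0],
          [0, 0, 0, 0],
          [0, 0, 0, 1],
          [0, 0, 0, 0]]

# Prediction with the right total count but one person in the wrong block.
misplaced = [[0, 0, 1, 0],
             [0, 0, 0, 0],
             [0, 0, 0, 1],
             [0, 0, 0, 0]]

print(count_abs_error(misplaced, target))  # whole-image count error
print(block_mae(misplaced, target, 2))     # block-level error with 2x2 blocks
print(block_mae(misplaced, target, 4))     # one block covering the whole map
```

In this toy case the whole-image count error is zero even though one person is localized in the wrong block, while the 2x2 block-level error is nonzero. That is the sense in which a block-level term can add local structural supervision. When a single block covers the whole map, the sketch reduces to the per-image absolute count error, which is consistent with the abstract's remark that BMAE resembles the evaluation metrics.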