Research on Local Counting and Object Detection of Multiscale Crowds in Video Based on Time-Frequency Analysis
Guoyin Ren, Xiaoqi Lu, Yuhao Li
J. Sensors, pp. 1-19, published 2022-08-12
DOI: 10.1155/2022/7247757 (https://doi.org/10.1155/2022/7247757)
Objective. Real-time crowd counting from camera footage becomes very difficult under congested conditions. Methods. This paper proposes a DRC-ConvLSTM network, which combines a depth-aware model with a depth-adaptive Gaussian kernel to extract spatial-temporal features from video and to perform depth-level matching under crowd depth-space edge constraints, yielding satisfactory crowd density estimates. The model is trained with weak supervision on a training set of point-labeled images. For detection, the paper proposes a deep adaptive perception network, DRD-NET, which uses the density map and an RGBD-adaptive perception network to better initialize the size and position of the head detection boxes in the image. Results. The method was evaluated on five labeled sequence datasets (MICC, CrowdFlow, FDST, Mall, and UCSD), and the results show that it achieves the best performance in RGBD dense video crowd counting. Conclusion. The experimental results show that the proposed DRD-NET model combined with DRC-ConvLSTM outperforms the existing ConvLSTM model for video crowd counting, and ablation experiments further support the effectiveness of the parameters of each part of the model.
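The abstract does not give the form of the depth-adaptive Gaussian kernel, so the following is only a minimal sketch of the general idea: each point-labeled head is spread into a density map with a Gaussian whose width depends on the depth at that pixel. The function name `depth_adaptive_density_map`, the rule `sigma = base_sigma * ref_depth / depth`, and all parameter values are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def depth_adaptive_density_map(points, depth, base_sigma=4.0, ref_depth=1.0,
                               min_sigma=1.0, max_sigma=15.0):
    """Build a crowd density map from head points and a per-pixel depth map.

    points: iterable of (row, col) integer head positions inside the image.
    depth:  2-D array of depth values, same shape as the output map.
    Assumed rule (hypothetical): sigma shrinks as depth grows, because
    distant heads cover fewer pixels.
    """
    height, width = depth.shape
    density = np.zeros((height, width), dtype=np.float64)
    rows = np.arange(height)[:, None]   # column vector of row indices
    cols = np.arange(width)[None, :]    # row vector of column indices
    for r, c in points:
        d = max(float(depth[r, c]), 1e-6)          # guard against zero depth
        sigma = float(np.clip(base_sigma * ref_depth / d, min_sigma, max_sigma))
        kernel = np.exp(-((rows - r) ** 2 + (cols - c) ** 2) / (2.0 * sigma ** 2))
        density += kernel / kernel.sum()           # each head contributes exactly 1
    return density

# Synthetic depth: the top of the image is far (5.0), the bottom is near (1.0).
depth = np.tile(np.linspace(5.0, 1.0, 64)[:, None], (1, 64))
heads = [(5, 10), (30, 30), (60, 40)]
density = depth_adaptive_density_map(heads, depth)
estimated_count = density.sum()
```

Because every kernel is normalized over the image, the sum of the density map equals the number of annotated heads, which is the usual way a count is read off a density map. Far heads get narrow, tall peaks and near heads get wide, flat ones.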