Ngoc Hoang Nguyen, Nhat Nguyen Xuan, T. Bui, Dao Huu Hung, S. Q. Truong, V. Hoang
2023 IEEE 17th International Conference on Automatic Face and Gesture Recognition (FG), published 2023-01-05. DOI: 10.1109/FG57933.2023.10042648 (https://doi.org/10.1109/FG57933.2023.10042648)
An efficient approach for real-time abnormal human behavior recognition on surveillance cameras
In recent years, abnormal human behavior recognition has become an active research topic in computer vision, driven by rapidly growing demand for monitoring human activities through closed-circuit television (CCTV) cameras. However, developing a deep learning-based model for abnormal or violent behavior recognition in surveillance systems remains challenging and costly because of inadequate data and model complexity. This paper presents an efficient approach to recognizing violent behavior such as fighting, sexual harassment, and fence climbing in real time on a multi-camera, one-edge-device system. Our approach develops a lightweight 3D CNN model, trained with an effective optimization process, to recognize human behavior from frame sequences of the CCTV video input. In the optimization method, we combine two deep learning techniques, knowledge distillation and contrastive learning, to improve the lightweight model's recognition of recorded human behaviors: the student network learns distilled information from both the larger model and contrastive object representations. We also establish a large CCTV human behavior video dataset containing 4,200 abnormal and 24,000 normal videos. The effectiveness of the proposed approach is shown by its high inference performance and by strong results on two public datasets, RWF-2000 and UCF101, as well as on our collected datasets.
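The abstract does not state how the distillation and contrastive terms are formulated or weighted, so the following is only a minimal sketch of one common way to combine them for a single training sample. It assumes a temperature-softened KL divergence for the distillation term and an InfoNCE-style loss over cosine similarities for the contrastive term. The function names, the temperatures, and the weights `alpha` and `beta` are all hypothetical and are not taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(student_logits, label):
    """Supervised loss of the student against the ground-truth class index."""
    return -math.log(softmax(student_logits)[label])


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Assumption: this is the generic soft-label distillation term,
    not necessarily the one used by the authors.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(anchor, positive, negatives, tau=0.1):
    """InfoNCE-style loss: pull the anchor toward the positive embedding
    and push it away from the negatives (assumed formulation)."""
    sims = [cosine(anchor, positive) / tau]
    sims += [cosine(anchor, n) / tau for n in negatives]
    return -math.log(softmax(sims)[0])


def total_loss(student_logits, teacher_logits, label,
               anchor, positive, negatives, alpha=0.5, beta=0.5):
    """Weighted sum of the three terms; alpha and beta are hypothetical."""
    return (cross_entropy(student_logits, label)
            + alpha * distillation_loss(student_logits, teacher_logits)
            + beta * contrastive_loss(anchor, positive, negatives))
```

In this sketch the anchor would be the student's embedding of a clip, the positive an embedding of the same clip or class (for example from the larger model), and the negatives embeddings of other clips. A real training loop would compute the same quantities on batched tensors in a deep learning framework.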