Guochen Shen, Faezeh Jamshidi, Decun Dong, Rei ZhG
2020 IEEE 3rd International Conference on Information Communication and Signal Processing (ICICSP), published 2020-09-01. DOI: 10.1109/ICICSP50920.2020.9232096 (https://doi.org/10.1109/ICICSP50920.2020.9232096)
Metro Pedestrian Detection Based on Mask R-CNN and Spatial-temporal Feature
In this paper, we apply Mask R-CNN, a deep-learning-based object detection method, to count pedestrians in surveillance video from metro train carriages and metro station platforms, and we fuse the detection results of multiple frames to reduce the detection error. To apply and analyze the detection results, we establish a spatial-temporal model of the number of pedestrians in the carriage and on the platform. Experiments show that the method is effective: the average accuracy of single-frame detection is 73.43%, and fusing the detection results of frames in time series raises the average accuracy to 88.85%, a relative increase of about 21% (15.42 percentage points). The pedestrian counts produced by our method can support metro management, pedestrian guidance, emergency management, and similar tasks.
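The multi-frame fusion described above can be illustrated with a small sketch. The abstract does not state the fusion rule, so this example assumes a centered sliding-window median over per-frame pedestrian counts; the function name fuse_counts, the window size, and the sample counts are hypothetical and are not taken from the paper.

```python
from statistics import median


def fuse_counts(frame_counts, window=5):
    """Smooth per-frame pedestrian counts with a centered sliding-window median.

    Assumption: the median is a stand-in fusion rule, not the paper's method.
    Windows are truncated at the start and end of the sequence.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    half = window // 2
    fused = []
    for i in range(len(frame_counts)):
        lo = max(0, i - half)
        hi = min(len(frame_counts), i + half + 1)
        # median() returns a float for even-length windows, so round to an int.
        fused.append(round(median(frame_counts[lo:hi])))
    return fused


# Hypothetical single-frame counts; frames 2 and 5 miss several people.
counts = [12, 12, 7, 12, 13, 8, 13, 13]
fused = fuse_counts(counts, window=3)
print(fused)  # [12, 12, 12, 12, 12, 13, 13, 13]
```

A median is used here because a single frame with occlusion or missed detections then changes the fused count less than it would under a mean. Any real system would need to tune the window against how quickly the true count changes, for example while passengers board or alight.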