Metro Pedestrian Detection Based on Mask R-CNN and Spatial-temporal Feature
Guochen Shen, Faezeh Jamshidi, Decun Dong, Rei ZhG
2020 IEEE 3rd International Conference on Information Communication and Signal Processing (ICICSP), 2020-09-01
DOI: https://doi.org/10.1109/ICICSP50920.2020.9232096
Abstract
In this paper, we apply Mask R-CNN, a deep-learning-based object detection method, to count pedestrians in surveillance video from metro train carriages and metro station platforms, and we fuse the detection results of multiple frames to reduce the detection error. To apply and analyze the detection results, we establish a spatial-temporal model of the number of pedestrians in the carriage and on the platform. Experiments show that the method is effective: the average accuracy of single-frame detection is 73.43%, and fusing the detection results of frames in a time series raises it to 88.85%, a relative increase of about 21% (15.42 percentage points). The pedestrian counts produced by our method can support metro management, pedestrian guidance, emergency management, and similar applications.
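
The abstract does not say how the per-frame results are fused, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes each frame has already been reduced to an integer pedestrian count (for example, by counting person detections from a Mask R-CNN model) and smooths those counts with a trailing-window median. The function name `fuse_counts`, the `window` parameter, and the choice of a median are all assumptions made for this example.

```python
from statistics import median_high
from typing import List, Sequence


def fuse_counts(frame_counts: Sequence[int], window: int = 5) -> List[int]:
    """Smooth per-frame pedestrian counts with a trailing-window median.

    Hypothetical fusion rule: the paper's actual rule is not given in the
    abstract. A median is used here because a single frame with missed or
    spurious detections does not shift it much.
    """
    if window < 1:
        raise ValueError("window must be at least 1")

    fused = []
    for i in range(len(frame_counts)):
        # Use the current frame and up to (window - 1) preceding frames.
        start = max(0, i - window + 1)
        recent = frame_counts[start:i + 1]
        # median_high returns one of the observed values, so the fused
        # count stays an integer even for even-sized windows.
        fused.append(median_high(recent))
    return fused


# Made-up per-frame counts with two frames that undercount (9 and 7).
raw_counts = [12, 12, 9, 12, 13, 12, 7, 12]
print(fuse_counts(raw_counts, window=3))
```

With a window of three frames, isolated outliers such as a briefly occluded pedestrian are outvoted by neighbouring frames. A longer window suppresses more noise but reacts more slowly when passengers actually board or alight.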