Min-Chi Lin, Shih-Chieh Lin, Y. Hwang, Chih-Peng Fan
{"title":"Designs and Comparisons of Facial Direction Detection Technology with YOLO Based Deep Learning Networks","authors":"Min-Chi Lin, Shih-Chieh Lin, Y. Hwang, Chih-Peng Fan","doi":"10.1145/3395245.3396411","DOIUrl":null,"url":null,"abstract":"By utilizing the deep learning model to develop the pedestrian facial direction classifier, in this paper, the proposed You only look once (YOLO)-based deep-learning technology is applied to analyze the images captured by camera to identify the facial directions of pedestrians. To enhance the training effect of mirror categories, the selected images are horizontally flipped to expand the datasets. To avoid misclassification of facial directions, the softmax scheme of the original YOLOv2 model is replaced with the logistic classifier used in YOLOv3, and the improved model is called YOLOv2_ logistic. By the same parameters on the webcam captured dataset, the experimental results show that the YOLOv2_logistic has the best performances on recall, precision, and mean Average Precision (mAP), which are 85%, 81%, 86.28%, respectively. The YOLOv3 tiny has the second performances, and its recall, precision, and mAP is 81%, 80%, 78.83%, respectively. In the frame per second (fps) test, the YOLOv3 tiny has the second best performance, and it reaches 30 fps on the Xavier platform. 
Although the facial direction detection performance of the YOLOv3 tiny model is slightly lower than that of the YOLOv2_logistic model, the performance of fps by the YOLOv3 tiny model is more than that by the YOLOv2_logistic model.","PeriodicalId":166308,"journal":{"name":"Proceedings of the 2020 8th International Conference on Information and Education Technology","volume":"34 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2020 8th International Conference on Information and Education Technology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3395245.3396411","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1
Abstract
In this paper, a You Only Look Once (YOLO)-based deep-learning approach is used to develop a pedestrian facial direction classifier: images captured by a camera are analyzed to identify the facial directions of pedestrians. To enhance the training effect for mirror categories, the selected images are horizontally flipped to expand the datasets. To avoid misclassification of facial directions, the softmax scheme of the original YOLOv2 model is replaced with the logistic classifier used in YOLOv3; the improved model is called YOLOv2_logistic. With the same parameters on the webcam-captured dataset, the experimental results show that YOLOv2_logistic achieves the best recall, precision, and mean Average Precision (mAP), at 85%, 81%, and 86.28%, respectively. YOLOv3 tiny performs second best, with recall, precision, and mAP of 81%, 80%, and 78.83%, respectively. In the frames-per-second (fps) test, YOLOv3 tiny has the second-best performance, reaching 30 fps on the Xavier platform. Although the facial direction detection performance of the YOLOv3 tiny model is slightly lower than that of the YOLOv2_logistic model, its fps is higher than that of the YOLOv2_logistic model.
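The two techniques highlighted in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper names (`flip_augment`, `mirror_map`) and the toy direction labels are assumptions, since the paper does not give implementation details. The sketch shows why a mirror-label map is needed when flipping (a left-facing pedestrian becomes right-facing) and how a logistic (sigmoid) classifier scores each class independently, unlike softmax, which forces the classes to compete for probability mass.

```python
import numpy as np

def flip_augment(images, labels, mirror_map):
    """Horizontally flip images and remap direction labels.

    mirror_map pairs each facial-direction class with its mirrored
    counterpart (e.g. "left" <-> "right"), so a flipped image is
    stored with the correct mirrored label. (Hypothetical helper;
    the paper only states that images are flipped to expand the
    datasets.)
    """
    flipped = [np.fliplr(img) for img in images]
    flipped_labels = [mirror_map.get(lbl, lbl) for lbl in labels]
    return images + flipped, labels + flipped_labels

def softmax(z):
    # Original YOLOv2 scheme: mutually exclusive class scores
    # that sum to 1, so similar directions compete with each other.
    e = np.exp(z - np.max(z))
    return e / e.sum()

def logistic(z):
    # YOLOv3-style logistic classifier: each class is scored
    # independently in (0, 1), which is the swap the paper makes
    # to reduce misclassification between facial directions.
    return 1.0 / (1.0 + np.exp(-z))
```

For example, flipping one image labeled "left" yields a two-image dataset whose second entry is labeled "right", and `logistic` applied to a raw score of 0 gives 0.5 for that class regardless of the other classes' scores.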