同时人脸检测和360度头部姿态估计

2021 13th International Conference on Knowledge and Systems Engineering (KSE) Pub Date : 2021-11-10 DOI:10.1109/KSE53942.2021.9648838

Hoang Nguyen Viet, Linh Nguyen Viet, T. N. Dinh, Duc Tran Minh, Long Tran Quoc

{"title":"同时人脸检测和360度头部姿态估计","authors":"Hoang Nguyen Viet, Linh Nguyen Viet, T. N. Dinh, Duc Tran Minh, Long Tran Quoc","doi":"10.1109/KSE53942.2021.9648838","DOIUrl":null,"url":null,"abstract":"With many practical applications in human life, including manufacturing surveillance cameras, analyzing and processing customer behavior, many researchers are noticing face detection and head pose estimation on digital images. A large number of proposed deep learning models have state-of-the-art accuracy such as YOLO, SSD, MTCNN, solving the problem of face detection or HopeNet, FSA-Net, RankPose model used for head pose estimation problem. According to many state-of-the-art methods, the pipeline of this task consists of two parts, from face detection to head pose estimation. These two steps are completely independent and do not share information. This makes the model clear in setup but does not leverage most of the featured resources extracted in each model. In this paper, we proposed the Multitask-Net model with the motivation to leverage the features extracted from the face detection model, sharing them with the head pose estimation branch to improve accuracy. Also, with the variety of data, the Euler angle domain representing the face is large, our model can predict with results in the 360° Euler angle domain. Applying the multitask learning method, the Multitask-Net model can simultaneously predict the position and direction of the human head. To increase the ability to predict the head direction of the model, we change the representation of the human face from the Euler angle to vectors of the Rotation matrix.","PeriodicalId":130986,"journal":{"name":"2021 13th International Conference on Knowledge and Systems Engineering (KSE)","volume":"55 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Simultaneous face detection and 360 degree head pose estimation\",\"authors\":\"Hoang Nguyen Viet, Linh Nguyen Viet, T. N. Dinh, Duc Tran Minh, Long Tran Quoc\",\"doi\":\"10.1109/KSE53942.2021.9648838\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"With many practical applications in human life, including manufacturing surveillance cameras, analyzing and processing customer behavior, many researchers are noticing face detection and head pose estimation on digital images. A large number of proposed deep learning models have state-of-the-art accuracy such as YOLO, SSD, MTCNN, solving the problem of face detection or HopeNet, FSA-Net, RankPose model used for head pose estimation problem. According to many state-of-the-art methods, the pipeline of this task consists of two parts, from face detection to head pose estimation. These two steps are completely independent and do not share information. This makes the model clear in setup but does not leverage most of the featured resources extracted in each model. In this paper, we proposed the Multitask-Net model with the motivation to leverage the features extracted from the face detection model, sharing them with the head pose estimation branch to improve accuracy. Also, with the variety of data, the Euler angle domain representing the face is large, our model can predict with results in the 360° Euler angle domain. Applying the multitask learning method, the Multitask-Net model can simultaneously predict the position and direction of the human head. To increase the ability to predict the head direction of the model, we change the representation of the human face from the Euler angle to vectors of the Rotation matrix.\",\"PeriodicalId\":130986,\"journal\":{\"name\":\"2021 13th International Conference on Knowledge and Systems Engineering (KSE)\",\"volume\":\"55 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-11-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 13th International Conference on Knowledge and Systems Engineering (KSE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/KSE53942.2021.9648838\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 13th International Conference on Knowledge and Systems Engineering (KSE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/KSE53942.2021.9648838","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 2

摘要

随着在人类生活中的许多实际应用，包括制造监控摄像机，分析和处理客户行为，许多研究人员注意到数字图像的人脸检测和头部姿势估计。大量提出的深度学习模型具有最先进的精度，如YOLO, SSD, MTCNN，解决人脸检测问题或用于头姿估计问题的HopeNet, FSA-Net, RankPose模型。根据许多最先进的方法，该任务的流程包括两个部分，从人脸检测到头部姿态估计。这两个步骤是完全独立的，不共享信息。这使得模型在设置中变得清晰，但没有利用每个模型中提取的大部分特征资源。在本文中，我们提出了Multitask-Net模型，其动机是利用从人脸检测模型中提取的特征，与头部姿态估计分支共享以提高准确性。此外，由于数据的多样性，代表人脸的欧拉角域很大，我们的模型可以在360°欧拉角域内进行预测。multitask - net模型采用多任务学习方法，可以同时预测人类头部的位置和方向。为了提高预测模型头部方向的能力，我们将人脸的表示从欧拉角改为旋转矩阵的向量。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Simultaneous face detection and 360 degree head pose estimation

With many practical applications in human life, including manufacturing surveillance cameras, analyzing and processing customer behavior, many researchers are noticing face detection and head pose estimation on digital images. A large number of proposed deep learning models have state-of-the-art accuracy such as YOLO, SSD, MTCNN, solving the problem of face detection or HopeNet, FSA-Net, RankPose model used for head pose estimation problem. According to many state-of-the-art methods, the pipeline of this task consists of two parts, from face detection to head pose estimation. These two steps are completely independent and do not share information. This makes the model clear in setup but does not leverage most of the featured resources extracted in each model. In this paper, we proposed the Multitask-Net model with the motivation to leverage the features extracted from the face detection model, sharing them with the head pose estimation branch to improve accuracy. Also, with the variety of data, the Euler angle domain representing the face is large, our model can predict with results in the 360° Euler angle domain. Applying the multitask learning method, the Multitask-Net model can simultaneously predict the position and direction of the human head. To increase the ability to predict the head direction of the model, we change the representation of the human face from the Euler angle to vectors of the Rotation matrix.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2021 13th International Conference on Knowledge and Systems Engineering (KSE)

自引率

0.00%

发文量