{"title":"Adaptive Visual Feedback Generation for Facial Expression Improvement with Multi-task Deep Neural Networks","authors":"Takuhiro Kaneko, Kaoru Hiramatsu, K. Kashino","doi":"10.1145/2964284.2967236","DOIUrl":null,"url":null,"abstract":"While many studies in computer vision and pattern recognition have been actively conducted to recognize people's current states, few studies have tackled the problem of generating feedback on how people can improve their states, although there are many real-world applications such as in sports, education, and health care. In particular, it has been challenging to develop such a system that can adaptively generate feedback for real-world situations, namely various input and target states, since it requires formulating various rules of feedback to do so. We propose a learning-based method to solve this problem. If we can obtain a large amount of feedback annotations, it is possible to explicitly learn the rules, but it is difficult to do so due to the subjective nature of the task. To mitigate this problem, our method implicitly learns the rules from training data consisting of input images, key-point annotations, and state annotations that do not require professional knowledge in feedback. Given such training data, we first learn a multi-task deep neural network with state recognition and key-point localization. Then, we apply a novel propagation method for extracting feedback information from the network. We evaluated our method in a facial expression improvement task using real-world data and clarified its characteristics and effectiveness.","PeriodicalId":140670,"journal":{"name":"Proceedings of the 24th ACM international conference on Multimedia","volume":"17 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"14","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 24th ACM international conference on Multimedia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2964284.2967236","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 14
Abstract
While many studies in computer vision and pattern recognition have addressed recognizing people's current states, few have tackled the problem of generating feedback on how people can improve those states, despite many real-world applications in areas such as sports, education, and health care. In particular, it has been challenging to develop a system that adaptively generates feedback in real-world situations, i.e., for various combinations of input and target states, because doing so requires formulating a feedback rule for each such combination. We propose a learning-based method to solve this problem. If a large number of feedback annotations were available, these rules could be learned explicitly, but collecting such annotations is difficult because of the subjective nature of the task. To mitigate this problem, our method learns the rules implicitly from training data consisting of input images, key-point annotations, and state annotations, none of which requires professional expertise in giving feedback. Given such training data, we first train a multi-task deep neural network that jointly performs state recognition and key-point localization. We then apply a novel propagation method to extract feedback information from the network. We evaluated our method on a facial-expression-improvement task using real-world data and clarified its characteristics and effectiveness.
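The abstract only outlines the architecture, so the following PyTorch sketch is a hypothetical illustration of the described setup: a shared trunk feeding a state-recognition head and a key-point-localization head, trained jointly, plus one plausible way to propagate a target-state signal back through the network. All layer sizes, the loss weight `lam`, and the gradient-based feedback step are assumptions made here for illustration, not the authors' implementation or their specific propagation method.

```python
# Illustrative sketch (not the paper's code): multi-task network with a
# shared trunk, a state head, and a key-point head, plus a gradient-based
# back-propagation step for feedback extraction. All sizes/names assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiTaskNet(nn.Module):
    def __init__(self, num_states: int = 2, num_keypoints: int = 68):
        super().__init__()
        self.num_keypoints = num_keypoints
        # Shared convolutional trunk over an RGB face image.
        self.trunk = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),  # -> (B, 64 * 4 * 4)
        )
        self.state_head = nn.Linear(64 * 16, num_states)            # e.g. neutral vs. smiling
        self.keypoint_head = nn.Linear(64 * 16, num_keypoints * 2)  # (x, y) per key point

    def forward(self, x):
        h = self.trunk(x)
        states = self.state_head(h)
        keypoints = self.keypoint_head(h).view(-1, self.num_keypoints, 2)
        return states, keypoints

def training_step(model, optimizer, images, state_labels, keypoint_targets, lam=1.0):
    """One joint update: cross-entropy for state recognition plus an L2
    term for key-point localization, weighted by `lam` (an assumed knob)."""
    optimizer.zero_grad()
    state_logits, keypoints = model(images)
    loss = (F.cross_entropy(state_logits, state_labels)
            + lam * F.mse_loss(keypoints, keypoint_targets))
    loss.backward()
    optimizer.step()
    return loss.item()

def feedback_saliency(model, image, target_state):
    """One plausible propagation step (an assumption, not the paper's exact
    method): back-propagate the target-state score to the input pixels to
    see which facial regions should change to better match the target."""
    image = image.clone().requires_grad_(True)
    state_logits, _ = model(image)
    score = state_logits[:, target_state].sum()
    (grad,) = torch.autograd.grad(score, image)
    return grad  # per-pixel influence on the target-state score
```

In a setup like this, the per-pixel gradient could be read off around the predicted key points to turn the propagated signal into localized movement suggestions; how the paper actually converts propagated information into feedback is specified in the full text, not here.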