Speech-Driven Facial Animation by LSTM-RNN for Communication Use
Ryosuke Nishimura, Nobuchika Sakata, T. Tominaga, Y. Hijikata, K. Harada, K. Kiyokawa
2019 12th Asia Pacific Workshop on Mixed and Augmented Reality (APMAR), March 2019
DOI: 10.1109/APMAR.2019.8709162
Citations: 1
Abstract
The goal of this research is to develop a system that generates, from speech alone, rich facial animation suitable for use in communication. High-quality 3DCG characters are widely used in VR, and facial animation is important for enhancing the realism of such characters. Typically, facial animation is driven by camera input. However, using a camera as the input source introduces problems: the camera's field of view is limited, and the face may not be detected depending on its orientation. It is therefore reasonable to develop a system that generates facial animation from voice alone. In this study, we generate facial expressions from speech alone using an LSTM-RNN, and we demonstrate the usefulness of the system for communication through a user study.
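To make the speech-to-animation mapping concrete, below is a minimal sketch of an LSTM-RNN that maps per-frame speech features to facial animation parameters. This is not the authors' implementation: the feature type (MFCC-like vectors), the parameter representation (blendshape weights), and all dimensions and layer sizes are assumptions chosen for illustration.

```python
# Minimal sketch (assumed architecture, not the paper's exact model):
# an LSTM-RNN mapping per-frame speech features to facial animation parameters.
import torch
import torch.nn as nn

class SpeechToFaceLSTM(nn.Module):
    def __init__(self, n_audio_feats=26, n_face_params=52, hidden=256, layers=2):
        # n_audio_feats: dimension of per-frame speech features (e.g., MFCCs) -- assumed
        # n_face_params: dimension of facial animation output (e.g., blendshape weights) -- assumed
        super().__init__()
        self.lstm = nn.LSTM(n_audio_feats, hidden, num_layers=layers, batch_first=True)
        self.head = nn.Linear(hidden, n_face_params)

    def forward(self, audio_feats):
        # audio_feats: (batch, time, n_audio_feats), one feature vector per audio frame
        h, _ = self.lstm(audio_feats)
        # produce one set of facial animation parameters per frame
        return self.head(h)

# Usage example: one utterance of 100 frames with 26-dim features
model = SpeechToFaceLSTM()
face_params = model(torch.randn(1, 100, 26))  # shape: (1, 100, 52)
```

The recurrent state lets each output frame depend on preceding speech context, which is what makes a sequence model such as an LSTM-RNN a natural fit for driving animation from audio.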