Towards building Indonesian viseme: A clustering-based approach

2013 IEEE International Conference on Computational Intelligence and Cybernetics (CYBERNETICSCOM). Pub Date: 2013-12-01. DOI: 10.1109/CYBERNETICSCOM.2013.6865781

Arifin, Muljono, S. Sumpeno, M. Hariadi


Citations: 11

Abstract

Lip animation plays an important role in facial animation. Realistic lip animation requires synchronizing visemes (visual phonemes) with the spoken phonemes. This research works towards building an Indonesian viseme set by configuring viseme classes from the result of clustering visual speech image data. Features are extracted with Subspace LDA, a combination of Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA), which is expected to produce an optimal dimension reduction. Clustering uses the K-Means algorithm to split the data into a number of clusters. The quality of the clustering result is measured by the Sum of Squared Error (SSE) and by the ratio of Between-Class Variation (BCV) to Within-Class Variation (WCV). From these measurements, we found that the best-quality clustering occurs at k=9. The finding of this research is an Indonesian viseme set consisting of 10 classes (9 classes from the clustering result and one neutral class). As future work, the result of this research can be used as a reference for an Indonesian viseme structure defined on the basis of linguistic knowledge.
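The clustering and evaluation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the feature vectors have already been reduced (the Subspace LDA step is not shown), it replaces random K-Means initialization with a deterministic farthest-point initialization so the example is reproducible, and it assumes one common textbook definition of the ratio: BCV taken as the mean pairwise distance between centroids and WCV taken as the SSE. The abstract does not state the exact definitions, so the function names, the toy data, and these formulas are all hypothetical.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def init_centroids(points, k):
    """Deterministic farthest-point initialization (an assumption of this
    sketch, chosen for reproducibility; not taken from the paper)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(dist(p, c) for c in centroids))
        centroids.append(list(far))
    return centroids


def kmeans(points, k, max_iter=100):
    """Plain Lloyd-style K-Means. Returns (centroids, labels)."""
    centroids = init_centroids(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for each point.
        new_labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean(members)
    return centroids, labels


def sse(points, centroids, labels):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


def bcv(centroids):
    """Assumed BCV: mean pairwise distance between centroids (needs k >= 2)."""
    k = len(centroids)
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    return sum(dist(centroids[i], centroids[j]) for i, j in pairs) / len(pairs)


def bcv_wcv_ratio(points, centroids, labels):
    """Assumed quality ratio BCV / WCV, with WCV taken to be the SSE."""
    wcv = sse(points, centroids, labels)
    return float("inf") if wcv == 0 else bcv(centroids) / wcv


if __name__ == "__main__":
    # Hypothetical 2-D toy data standing in for reduced visual features.
    data = [
        [0.0, 0.0], [0.5, 0.2], [0.1, 0.6],
        [5.0, 5.0], [5.4, 4.8], [4.9, 5.5],
        [9.0, 0.0], [9.3, 0.4], [8.8, 0.1],
    ]
    for k in range(2, 6):
        cents, labs = kmeans(data, k)
        print(k, round(sse(data, cents, labs), 3),
              round(bcv_wcv_ratio(data, cents, labs), 3))
```

Under these assumptions, a candidate k is judged by a low SSE together with a high BCV/WCV ratio. Since SSE alone tends to keep falling as k grows, the ratio is what makes values of k comparable in this sketch.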

