A Visual Speech Feature to Identify the Speaking States from Video

2010 International Conference on Multimedia Technology Pub Date : 2010-11-11 DOI:10.1109/ICMULT.2010.5629829

Xibin Jia, Baocai Yin, Yanfeng Sun


Citation count: 0

Abstract


A Visual Speech Feature to Identify the Speaking States from Video

This paper proposes a visual speech feature for images of the speaking mouth taken from video, combining shape cues with local teeth texture. The proposed geometric feature is based on the Euclidean distances between the feature points around the inner and outer lip contours. The local texture, with the G and B color components as a baseline, is used to compute color moments that describe the visibility of the teeth. The two features are combined by weighted fusion. The k-means algorithm is used to assess feature performance by evaluating the clustering results. The results show that deriving the local texture from the G and B color components models teeth visibility better than the alternatives considered, and that the proposed feature distinguishes visemes better than either PCA features or the geometric feature alone.
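The pipeline outlined in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the landmark layout, which color moments are used, how the weighted fusion is formed, or how k-means is initialized. Here the three moments are taken to be the mean, the standard deviation, and the cube root of the third central moment; fusion is assumed to be a weighted concatenation; and the `weight` parameter and all function names are hypothetical.

```python
import math
from itertools import combinations


def geometric_feature(points):
    """Pairwise Euclidean distances between lip landmarks given as (x, y)."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def color_moments(values):
    """Mean, standard deviation, and signed cube root of the third
    central moment of one color channel (assumed choice of moments)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    third = sum((v - mean) ** 3 for v in values) / n
    skew = math.copysign(abs(third) ** (1 / 3), third)
    return [mean, math.sqrt(var), skew]


def texture_feature(pixels):
    """Color moments of the G and B channels of a list of (R, G, B) pixels."""
    g = [p[1] for p in pixels]
    b = [p[2] for p in pixels]
    return color_moments(g) + color_moments(b)


def fuse(geometric, texture, weight=0.5):
    """Weighted concatenation of the two features.
    `weight` is a hypothetical tuning parameter, not a value from the paper."""
    return [weight * v for v in geometric] + [(1 - weight) * v for v in texture]


def kmeans(vectors, k, iterations=20):
    """Plain Lloyd's k-means. The first k vectors serve as initial centers,
    an assumption made only to keep the sketch deterministic."""
    centers = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each vector to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(v, centers[c]))
                  for v in vectors]
        # Move each center to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Toy data: two closed-mouth frames (dark region, small opening) and
# two open-mouth frames (bright teeth region, large opening).
frames = [
    ([(0, 0), (10, 0), (5, 1)], [(90, 40, 40)] * 4),
    ([(0, 0), (10, 0), (5, 2)], [(95, 45, 42)] * 4),
    ([(0, 0), (10, 0), (5, 9)], [(230, 225, 220)] * 4),
    ([(0, 0), (10, 0), (5, 8)], [(228, 220, 215)] * 4),
]
features = [fuse(geometric_feature(pts), texture_feature(px)) for pts, px in frames]
labels, _ = kmeans(features, k=2)
```

In this toy run the closed-mouth and open-mouth frames should fall into separate clusters. The raw distances and color values are on different scales, so in practice each feature would likely need normalization before fusion; that step is omitted here because the abstract does not describe it.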
