{"title":"从视频中识别说话状态的视觉语音特征","authors":"Xibin Jia, Baocai Yin, Yanfeng Sun","doi":"10.1109/ICMULT.2010.5629829","DOIUrl":null,"url":null,"abstract":"The paper proposes a kind of visual speech feature for the speaking mouth images from the video combining clues of the shape and local teeth texture. The geometric feature we proposed based on the computing the Euclidian distant between each the feature point around the inner and outer lip. The local texture with G and B components as baseline is employed to calculate the color moment to describe the visibility of teeth. The weighted fusion is used to combine the two features. The k-mean algorithm is utilized to analyze the feature performance according to evaluate the clustering results. The results show that with G and B color component to derive the local texture to model the teeth visibility are better than the others and our feature has higher ability to perceive the visemes than the PCA and geometric feature only.","PeriodicalId":412601,"journal":{"name":"2010 International Conference on Multimedia Technology","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A Visual Speech Feature to Indentify the Speaking States from Video\",\"authors\":\"Xibin Jia, Baocai Yin, Yanfeng Sun\",\"doi\":\"10.1109/ICMULT.2010.5629829\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The paper proposes a kind of visual speech feature for the speaking mouth images from the video combining clues of the shape and local teeth texture. The geometric feature we proposed based on the computing the Euclidian distant between each the feature point around the inner and outer lip. The local texture with G and B components as baseline is employed to calculate the color moment to describe the visibility of teeth. The weighted fusion is used to combine the two features. The k-mean algorithm is utilized to analyze the feature performance according to evaluate the clustering results. The results show that with G and B color component to derive the local texture to model the teeth visibility are better than the others and our feature has higher ability to perceive the visemes than the PCA and geometric feature only.\",\"PeriodicalId\":412601,\"journal\":{\"name\":\"2010 International Conference on Multimedia Technology\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2010-11-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2010 International Conference on Multimedia Technology\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICMULT.2010.5629829\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2010 International Conference on Multimedia Technology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICMULT.2010.5629829","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A Visual Speech Feature to Indentify the Speaking States from Video
The paper proposes a kind of visual speech feature for the speaking mouth images from the video combining clues of the shape and local teeth texture. The geometric feature we proposed based on the computing the Euclidian distant between each the feature point around the inner and outer lip. The local texture with G and B components as baseline is employed to calculate the color moment to describe the visibility of teeth. The weighted fusion is used to combine the two features. The k-mean algorithm is utilized to analyze the feature performance according to evaluate the clustering results. The results show that with G and B color component to derive the local texture to model the teeth visibility are better than the others and our feature has higher ability to perceive the visemes than the PCA and geometric feature only.