Audiovisual articulatory inversion based on Gaussian Mixture Model (GMM)
I. Ozbek, M. Demirekler
2010 IEEE 18th Signal Processing and Communications Applications Conference, 2010-04-22
DOI: 10.1109/SIU.2010.5653987 (https://doi.org/10.1109/SIU.2010.5653987)
In this study, we examined articulatory inversion from audiovisual information using a Gaussian Mixture Model (GMM). In this method, the joint distribution of the articulatory movements and the audio (and/or visual) data is modelled as a mixture of Gaussians. The conditional expected value under the GMM is used as the regression function from the audio (and/or visual) space to the articulatory space. We also examined various fusion methods for combining acoustic and visual information in articulatory inversion. The fusion methods improve the performance of articulatory inversion.
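
To make the regression step concrete, the following is a minimal sketch of how a conditional expected value can be computed from a joint GMM. It is not the authors' implementation: it assumes a scalar input feature x (standing in for an audio or visual feature) and a scalar output y (standing in for an articulatory coordinate), whereas the features in practice are vectors. The function names, the dictionary layout, and the parameter values in `TOY_COMPONENTS` are all hypothetical and chosen only for illustration. Each component contributes its own linear prediction of y, and the predictions are averaged using the posterior probability of each component given x.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmm_conditional_mean(x, components):
    """Return E[y | x] under a joint GMM over the scalar pair (x, y).

    Each component is a dict with keys:
      weight  - mixture weight
      mean_x  - mean of the input feature
      mean_y  - mean of the output feature
      var_x   - variance of the input feature
      cov_yx  - covariance between output and input
    (This layout is an assumption made for the sketch.)
    """
    # Unnormalised posterior probability of each component given x.
    scores = [
        c["weight"] * gaussian_pdf(x, c["mean_x"], c["var_x"]) for c in components
    ]
    total = sum(scores)
    if total == 0.0:
        raise ValueError("x has zero density under every component")

    estimate = 0.0
    for score, c in zip(scores, components):
        # Conditional mean of y given x within a single Gaussian component.
        local_mean = c["mean_y"] + c["cov_yx"] / c["var_x"] * (x - c["mean_x"])
        # Weight it by the component's posterior probability.
        estimate += (score / total) * local_mean
    return estimate


# Hypothetical two-component model, for illustration only.
TOY_COMPONENTS = [
    {"weight": 0.5, "mean_x": -1.0, "mean_y": -2.0, "var_x": 1.0, "cov_yx": 0.0},
    {"weight": 0.5, "mean_x": 1.0, "mean_y": 2.0, "var_x": 1.0, "cov_yx": 0.0},
]

if __name__ == "__main__":
    print(gmm_conditional_mean(0.3, TOY_COMPONENTS))
```

In the vector case the division by `var_x` becomes multiplication by the inverse of the input covariance matrix, and the component parameters would be estimated from paired training data rather than written by hand. How the audio and visual streams are fused is not specified in the abstract, so it is left out of this sketch.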