Speech Bandwidth Extension Based on GMM and Clustering Method
Authors: Yingxue Wang, Shenghui Zhao, Yibiao Yu, Jingming Kuang
Published in: 2015 Fifth International Conference on Communication Systems and Network Technologies, 2015-04-04
DOI: 10.1109/CSNT.2015.233 (https://doi.org/10.1109/CSNT.2015.233)
Conventional speech bandwidth extension (BWE) methods based on the Gaussian mixture model (GMM) often suffer from over-smoothing. To address this, a BWE method is proposed that combines a clustering step with GMMs whose parameters are estimated by expectation-maximization (EM). First, the low-frequency and high-frequency parameters are clustered, and a GMM is trained for each cluster. Then the low-frequency parameters are transformed into high-frequency parameters using the learned mapping function of the corresponding GMM. Self-organizing feature map (SOFM) and vector quantization (VQ) are used as the clustering methods. Subjective and objective evaluations show that the proposed method improves the quality of the synthesized speech compared with the conventional GMM-based BWE method and largely overcomes its over-smoothing problem.
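The cluster-then-map idea can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it simplifies heavily: features are scalars, plain k-means stands in for the VQ codebook training, clustering uses only the low-band feature, and each cluster gets a single joint Gaussian (a one-component GMM, which needs no EM iterations) whose conditional mean serves as the mapping function. All function names and the toy data are hypothetical.

```python
from statistics import fmean


def nearest(codebook, x):
    """Index of the codeword closest to x (the VQ encoding step)."""
    return min(range(len(codebook)), key=lambda i: (x - codebook[i]) ** 2)


def train_codebook(values, k, iters=20):
    """Plain k-means on scalar features, used here as a simple VQ trainer."""
    ordered = sorted(values)
    n = len(ordered)
    # Initialise codewords at evenly spaced quantiles of the data.
    codebook = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(iters):
        cells = [[] for _ in range(k)]
        for v in values:
            cells[nearest(codebook, v)].append(v)
        # Keep the old codeword if a cell ends up empty.
        codebook = [fmean(c) if c else w for c, w in zip(cells, codebook)]
    return codebook


def fit_cluster_maps(low, high, codebook):
    """Fit one joint Gaussian over (low, high) per cluster.

    Returns (mean_low, mean_high, slope) per cluster, where
    slope = cov(low, high) / var(low) defines the conditional mean.
    """
    maps = []
    for i, word in enumerate(codebook):
        pairs = [(x, y) for x, y in zip(low, high) if nearest(codebook, x) == i]
        if not pairs:
            maps.append((word, 0.0, 0.0))  # empty cluster: placeholder mapping
            continue
        mx = fmean([x for x, _ in pairs])
        my = fmean([y for _, y in pairs])
        var_x = fmean([(x - mx) ** 2 for x, _ in pairs])
        cov_xy = fmean([(x - mx) * (y - my) for x, y in pairs])
        slope = cov_xy / var_x if var_x > 0 else 0.0
        maps.append((mx, my, slope))
    return maps


def extend_band(x, codebook, maps):
    """Map a low-band feature to a high-band estimate with its cluster's model."""
    mx, my, slope = maps[nearest(codebook, x)]
    return my + slope * (x - mx)  # conditional mean E[high | low = x]


# Toy data (made up): two regimes with different low-to-high relationships.
low = [i / 10 for i in range(10)] + [10 + i / 10 for i in range(10)]
high = [2 * x for x in low[:10]] + [50 - 3 * x for x in low[10:]]

codebook = train_codebook(low, k=2)
maps = fit_cluster_maps(low, high, codebook)
estimate = extend_band(0.5, codebook, maps)
```

In this toy setting each cluster recovers its own local relationship, whereas a single model fitted to all the data would have to average across both regimes. That averaging is one intuition for the over-smoothing the abstract describes, though the paper's actual features and models are richer than this sketch.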