An Auditory-Based Monaural Feature for Noisy and Reverberant Speech Enhancement

2017 International Conference on Computing Intelligence and Information System (CIIS). Pub Date: 2017-04-01. DOI: 10.1109/CIIS.2017.23

Yi-jiao Jiang, Runsheng Liu, Ya Bai


Citations: 1

Abstract

Speech enhancement based on deep neural networks (DNNs) is an active topic in machine learning and speech processing. Even with deep neural networks, improving speech quality under noisy and reverberant conditions remains difficult. For machine-learning-based systems, auditory feature extraction is a key factor in speech enhancement and recognition. In this paper, we propose a speech enhancement framework built on an auditory-based monaural feature that models the function of the human auditory system. The feature is extracted from the signal after it passes through a gammatone filter bank, which provides finer resolution at low frequencies than conventional filters. Systematic tests show that the proposed auditory-based monaural feature outperforms mel-frequency cepstral coefficients (MFCC) in noisy and reverberant environments.
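The abstract's point about low-frequency detail can be illustrated with a small sketch. It is not the authors' implementation: the channel count, the 50 Hz to 8 kHz range, the Glasberg-Moore ERB formulas, the mel formula variant, the filter order of 4, and the 1.019 bandwidth factor are all common textbook choices assumed here for illustration, and every function name is hypothetical. The sketch spaces center frequencies evenly on the ERB-number scale (as gammatone filter banks typically do) and on the mel scale, then counts how many channels of each fall below 1 kHz. It also samples a single gammatone impulse response.

```python
import math

def hz_to_erb_number(f_hz):
    """Map frequency in Hz to the ERB-number scale (Glasberg-Moore form, assumed)."""
    return 21.4 * math.log10(1.0 + 0.00437 * f_hz)

def erb_number_to_hz(e):
    """Inverse of hz_to_erb_number."""
    return (10.0 ** (e / 21.4) - 1.0) / 0.00437

def hz_to_mel(f_hz):
    """Map frequency in Hz to the mel scale (2595*log10 variant, assumed)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def spaced_centers(lo_hz, hi_hz, n, to_scale, from_scale):
    """Return n center frequencies equally spaced on a warped frequency scale."""
    lo, hi = to_scale(lo_hz), to_scale(hi_hz)
    step = (hi - lo) / (n - 1)
    return [from_scale(lo + i * step) for i in range(n)]

def gammatone_ir(fc_hz, fs_hz, duration_s=0.025, order=4):
    """Sample g(t) = t^(order-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t), unnormalized."""
    erb = 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)  # equivalent rectangular bandwidth in Hz
    b = 1.019 * erb                             # bandwidth parameter (assumed factor)
    n_samples = int(round(duration_s * fs_hz))
    samples = []
    for k in range(n_samples):
        t = k / fs_hz
        envelope = t ** (order - 1) * math.exp(-2.0 * math.pi * b * t)
        samples.append(envelope * math.cos(2.0 * math.pi * fc_hz * t))
    return samples

# Illustrative settings only; the paper's actual configuration is not given here.
N_CHANNELS, LO_HZ, HI_HZ = 64, 50.0, 8000.0

gt_centers = spaced_centers(LO_HZ, HI_HZ, N_CHANNELS, hz_to_erb_number, erb_number_to_hz)
mel_centers = spaced_centers(LO_HZ, HI_HZ, N_CHANNELS, hz_to_mel, mel_to_hz)

gt_low = sum(1 for f in gt_centers if f < 1000.0)
mel_low = sum(1 for f in mel_centers if f < 1000.0)

print(f"channels below 1 kHz: ERB-spaced={gt_low}, mel-spaced={mel_low}")
```

Under these assumed settings, the ERB-spaced bank places more of its channels below 1 kHz than the mel-spaced one, which is one way to read the claim that gammatone filters carry more low-frequency detail. A full feature extractor would additionally convolve the input with each channel's impulse response and summarize the per-channel energy over frames, which this sketch does not attempt.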

