Quality Estimation of Noisy Speech Using Spectral Entropy Distance
Gabriel Mittag, S. Möller
2019 26th International Conference on Telecommunications (ICT), published 2019-04-01
DOI: 10.1109/ICT.2019.8798783 (https://doi.org/10.1109/ICT.2019.8798783)
In this paper, we propose spectral entropy distance as a new measure for the objective quality estimation of noisy speech. While estimating the perceived quality of a transmitted speech signal under background noise is fairly straightforward, estimating noise on active speech is more complex. For example, common quality measures can mistake an increase in loudness for noise. Other distortions, such as interruptions due to packet loss, can decrease the energy in the degraded signal and thus lead to an underestimation of the noisiness. This is especially critical when the noise is present only during active speech segments, as is the case for quantization noise caused by low-bitrate codecs or by voice activity detection at the receiver side. The spectral entropy, however, considers only the frequency composition of a signal and does not depend on the signal energy. It therefore gives a robust measure of how noisy a signal is in the presence of active speech. In our experiments, we trained a prediction model based on the spectral entropy and obtained excellent prediction results, which show that the spectral entropy distance is indeed a useful tool for the quality estimation of noisy speech.
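The energy independence described in the abstract can be illustrated with a minimal sketch. It assumes that spectral entropy is the Shannon entropy of a frame's normalized power spectrum, and that a distance can be taken as the mean absolute difference between the per-frame entropies of a reference and a degraded signal. The abstract does not give the paper's exact definition or its prediction model, so the function names (`spectral_entropy`, `frame_entropies`, `entropy_distance`), the frame length, the hop size, and the synthetic test signals are all hypothetical choices made for illustration.

```python
import numpy as np


def spectral_entropy(frame, eps=1e-12):
    """Shannon entropy (in bits) of one frame's normalized power spectrum."""
    power = np.abs(np.fft.rfft(frame)) ** 2
    # Normalizing to a probability distribution removes the overall energy.
    p = power / (power.sum() + eps)
    return float(-np.sum(p * np.log2(p + eps)))


def frame_entropies(signal, frame_len=512, hop=256):
    """Spectral entropy of each Hann-windowed frame (assumed framing)."""
    window = np.hanning(frame_len)
    starts = range(0, len(signal) - frame_len + 1, hop)
    return np.array(
        [spectral_entropy(signal[s:s + frame_len] * window) for s in starts]
    )


def entropy_distance(reference, degraded, frame_len=512, hop=256):
    """Mean absolute per-frame entropy difference (assumed definition)."""
    ref = frame_entropies(reference, frame_len, hop)
    deg = frame_entropies(degraded, frame_len, hop)
    n = min(len(ref), len(deg))
    return float(np.mean(np.abs(deg[:n] - ref[:n])))


# Synthetic stand-in for speech: two harmonics sampled at 16 kHz.
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
clean = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)
louder = 4.0 * clean                                 # level change only
noisy = clean + 0.3 * rng.standard_normal(t.size)    # additive white noise

dist_louder = entropy_distance(clean, louder)
dist_noisy = entropy_distance(clean, noisy)
print(f"louder: {dist_louder:.6f} bits, noisy: {dist_noisy:.6f} bits")
```

In this sketch, amplifying the signal leaves the distance essentially at zero because the normalization cancels the gain, while additive noise flattens the spectrum and raises the entropy. Real speech and the distortions named in the abstract would need proper time alignment and a trained mapping to quality scores, neither of which is covered here.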