Using Approximate Entropy as a speech quality measure for a speaker recognition system

2016 Annual Conference on Information Science and Systems (CISS) Pub Date : 2016-03-16 DOI:10.1109/CISS.2016.7460517

R. A. Metzger, J. Doherty, D. Jenkins

Citations: 6

Abstract

In this paper, we show that Approximate Entropy (ApEn) can be used to detect high-quality speech frames in an otherwise distorted speech signal. By exploiting the quasi-periodicity of speech, ApEn detects small aberrations in speech frames that would otherwise degrade the performance of an automatic speaker recognition (ASR) system. In addition, we obtain the statistics of ApEn values representative of clean speech and propose threshold bounds that maximize recognition rates. In our simulations, ApEn outperformed other popular voice activity detector (VAD) algorithms at discerning clean speech from noisy speech. This ability to detect clean speech allows a speaker recognition system to reach a recognition rate close to 87%, which is close to the system's performance when no noise is present.
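The frame-selection idea in the abstract can be sketched in Python. This is a minimal illustration, not the authors' implementation: it uses the standard definition of approximate entropy, and the embedding dimension `m=2`, the tolerance `r = 0.2 * std`, the frame length, and the `lower`/`upper` bounds are all assumptions, since the abstract does not state the values used in the paper. The function names `approximate_entropy` and `select_frames` are hypothetical.

```python
import numpy as np


def approximate_entropy(x, m=2, r_factor=0.2):
    """Approximate entropy of a 1-D signal (standard textbook definition).

    m and r_factor are common defaults, assumed here; the paper's
    actual settings are not given in the abstract.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    r = r_factor * np.std(x)  # tolerance scaled to the frame's spread

    def phi(k):
        # All overlapping templates of length k.
        templates = np.array([x[i:i + k] for i in range(n - k + 1)])
        # Chebyshev (max-abs) distance between every pair of templates.
        dist = np.max(np.abs(templates[:, None, :] - templates[None, :, :]), axis=2)
        # Fraction of templates within r of each template. Self-matches are
        # counted, so the fraction is never zero and the log is defined.
        frac = np.mean(dist <= r, axis=1)
        return np.mean(np.log(frac))

    return phi(m) - phi(m + 1)


def select_frames(signal, frame_len, lower, upper):
    """Indices of non-overlapping frames whose ApEn lies in [lower, upper].

    The bounds are hypothetical placeholders for the threshold bounds
    the abstract mentions.
    """
    signal = np.asarray(signal, dtype=float)
    keep = []
    for i in range(signal.size // frame_len):
        frame = signal[i * frame_len:(i + 1) * frame_len]
        if lower <= approximate_entropy(frame) <= upper:
            keep.append(i)
    return keep
```

A quasi-periodic frame, such as voiced speech, repeats its own patterns and so tends to score a low ApEn, while a noise-dominated frame tends to score higher. Keeping only frames inside a band of ApEn values is one way to pass clean frames on to the recognizer.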
