Efficient feature extraction of speaker identification using phoneme mean F-ratio for Chinese

Chen Zhao, Hongcui Wang, Songgun Hyon, Jianguo Wei, J. Dang
{"title":"Efficient feature extraction of speaker identification using phoneme mean F-ratio for Chinese","authors":"Chen Zhao, Hongcui Wang, Songgun Hyon, Jianguo Wei, J. Dang","doi":"10.1109/ISCSLP.2012.6423485","DOIUrl":null,"url":null,"abstract":"The features used for speaker recognition should have more speaker individual information while attenuating the linguistic information. In order to discard the linguistic information effectively, in this paper, we employed the phoneme mean F-ratio method to investigate the different contributions of different frequency region from the point of view of Chinese phoneme, and apply it for speaker identification. It is found that the speaker individual information depending on the phonemes is distributed in different frequency regions of speech sound. Based on the contribution rate, we extracted the new features and combined with GMM model. The experiment for speaker identification task is conducted with a King-ASR Chinese database. Compared with the MFCC feature, the identification error rate with the proposed feature was reduced by 32.94%. The results confirmed that the efficiency of the phoneme mean F-ratio method for improving speaker recognition performance for Chinese.","PeriodicalId":186099,"journal":{"name":"2012 8th International Symposium on Chinese Spoken Language Processing","volume":"11 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2012 8th International Symposium on Chinese Spoken Language Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISCSLP.2012.6423485","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 10

Abstract

The features used for speaker recognition should have more speaker individual information while attenuating the linguistic information. In order to discard the linguistic information effectively, in this paper, we employed the phoneme mean F-ratio method to investigate the different contributions of different frequency region from the point of view of Chinese phoneme, and apply it for speaker identification. It is found that the speaker individual information depending on the phonemes is distributed in different frequency regions of speech sound. Based on the contribution rate, we extracted the new features and combined with GMM model. The experiment for speaker identification task is conducted with a King-ASR Chinese database. Compared with the MFCC feature, the identification error rate with the proposed feature was reduced by 32.94%. The results confirmed that the efficiency of the phoneme mean F-ratio method for improving speaker recognition performance for Chinese.
基于音素平均f比的汉语说话人识别高效特征提取
用于说话人识别的特征应该在衰减语言信息的同时包含更多的说话人个体信息。为了有效地剔除语言信息,本文采用音素均值f比方法,从汉语音素的角度考察不同频率区域的不同贡献,并将其应用于说话人识别。研究发现,说话人的个体信息根据音素分布在语音的不同频率区域。基于贡献率提取新特征,并与GMM模型相结合。使用King-ASR中文数据库进行说话人识别实验。与MFCC特征相比,该特征的识别错误率降低了32.94%。结果证实了音素平均f比法提高汉语说话人识别性能的有效性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信