An Advance on Gender Classification by Information Preserving Features

K. Kuppusamy, C. Eswaran
DOI: 10.1145/3277453.3277462
Published in: Proceedings of the 2018 International Conference on Electronics and Electrical Engineering Technology
Publication date: 2018-09-19
Citations: 3

Abstract

One of the most challenging issues in Speaker's Gender Classification (SGC) is feature extraction, since information lost while extracting features from the speech signal degrades classification accuracy. In previous research, Perceptual Linear Prediction (PLP) coefficients were extracted using the Blackman windowing method, together with other speech-signal features, to improve classification accuracy. However, some information was still lost at the window edges, which degraded recognition accuracy, and more effective features were needed to improve classification performance. Hence, in this paper SGC is improved by extracting PLP coefficients with a novel windowing technique. In this technique, type-1 features, such as the spectral and prosodic features of the speech signal, are extracted first. In addition, Information Preserving Perceptual Linear Prediction (IPPLP) coefficients are extracted using the Slepian windowing method. Moreover, the frequency-dependent transmission characteristics of the outer ear are compensated based on an analysis of time-varying Equal Loudness Contour (ELC) curves and the Peak-to-Loudness Ratio (PLR). The extracted IPPLP features are then fused with the type-1 features and classified using different combinations of classifiers, namely the Gaussian Mixture Model (GMM), Support Vector Machine (SVM), and GMM supervector-based SVM, under a score-level fusion scheme. The speaker's gender is recognized from the final classification result. The experimental results show significant improvements in classification accuracy with the proposed technique: accuracies of 38.55%, 62.65%, and 69.88% are obtained with the GMM, SVM, and GMM-SVM classifiers, respectively.
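The abstract's key departure from prior work is replacing the Blackman window with a Slepian (DPSS) taper, whose maximal main-lobe energy concentration is what reduces information loss at the frame edges. The paper gives no implementation details, so the following is only a minimal sketch of that windowing step, assuming SciPy's `dpss` routine and hypothetical frame parameters (25 ms frames with a 10 ms hop at 16 kHz):

```python
import numpy as np
from scipy.signal import windows

def frame_with_dpss(signal, frame_len=400, hop=160, nw=2.5):
    """Split a speech signal into overlapping frames and apply a
    DPSS (Slepian) taper in place of the usual Blackman window.

    frame_len, hop and the time-bandwidth product nw are illustrative
    choices, not values taken from the paper.
    """
    # dpss(M, NW) returns the first (most concentrated) Slepian sequence
    taper = windows.dpss(frame_len, NW=nw)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.empty((n_frames, frame_len))
    for i in range(n_frames):
        frames[i] = signal[i * hop : i * hop + frame_len] * taper
    return frames
```

In a full pipeline, each tapered frame would then feed the standard PLP chain (power spectrum, Bark warping, equal-loudness weighting, cube-root compression, autoregressive modeling) to produce the IPPLP coefficients.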
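The classifiers are combined under a score-level fusion scheme, meaning each classifier's per-utterance scores are normalized and combined before the final gender decision, rather than fusing the features themselves. The paper does not specify the normalization or weighting, so this sketch assumes min-max normalization with equal weights, both hypothetical choices:

```python
import numpy as np

def fuse_scores(scores_a, scores_b, w=0.5):
    """Hypothetical score-level fusion of two classifiers' scores.

    Each score vector (one entry per utterance, higher = stronger
    evidence for a class) is min-max normalized to [0, 1], then
    combined as a weighted sum. Equal weights are an assumption;
    the paper does not report its fusion weights.
    """
    def minmax(s):
        s = np.asarray(s, dtype=float)
        rng = s.max() - s.min()
        return (s - s.min()) / rng if rng > 0 else np.zeros_like(s)

    return w * minmax(scores_a) + (1 - w) * minmax(scores_b)

# Example: per-utterance scores for one class from two classifiers
# (illustrative numbers, e.g. GMM log-likelihoods and SVM margins)
gmm_scores = [0.2, 0.9, 0.4]
svm_scores = [1.5, 3.0, 2.1]
fused = fuse_scores(gmm_scores, svm_scores)
```

The decision is then taken on the fused score (e.g. argmax over per-class fused scores), which is consistent with the abstract's statement that the gender is recognized from the final, fused classification result.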