Text-Independent Speaker Identification using Mel-Frequency Energy Coefficients and Convolutional Neural Networks

Déhia Abdiche, K. Harrar
DOI: 10.1109/IHSH51661.2021.9378726
Published in: 2020 2nd International Workshop on Human-Centric Smart Environments for Health and Well-being (IHSH)
Publication date: 2021-02-09

Abstract

Automatic Speaker Identification (ASI) is a biometric technique that has achieved reliability in real applications using standard feature extraction methods such as Linear Predictive Cepstral Coefficients (LPCC) and Perceptual Linear Prediction (PLP), together with modeling methods such as the Gaussian mixture model (GMM). However, the success of these manual approaches was quickly hampered by the emergence of big data and the inability of scientists to process such large amounts of data by hand, which led researchers to move towards automatic methods such as deep neural networks. In this work, a Convolutional Neural Network (CNN) is proposed for speaker identification in text-independent mode. The Mel-Frequency Energy Coefficients (MFEC) method was used to extract features from the audio signals, and the resulting coefficients were fed into the CNN model for classification (identification). In addition, the proposed method was compared with existing traditional methods. Experimental results show that the proposed structure achieved a speaker identification rate of 97.89%, which is much higher than the rates obtained with earlier state-of-the-art methods.
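
The abstract does not give the MFEC parameters or the CNN architecture. As a rough illustration of the feature-extraction step only, the following is a minimal sketch, assuming MFEC denotes the log energies of a triangular mel filterbank (the MFCC pipeline without the final DCT). The function names, frame length, hop size, and filter count are hypothetical choices for illustration and are not taken from the paper.

```python
import cmath
import math


def hz_to_mel(hz):
    # Common HTK-style mel mapping (an assumption; the paper does not specify one).
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge points, expressed as (fractional) DFT bin positions.
    edges = [mel_to_hz(mel_max * i / (n_filters + 1)) * n_fft / sample_rate
             for i in range(n_filters + 2)]
    bank = []
    for m in range(1, n_filters + 1):
        left, center, right = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for k in range(n_bins):
            if left < k <= center:
                row.append((k - left) / (center - left))      # rising slope
            elif center < k < right:
                row.append((right - k) / (right - center))    # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def power_spectrum(frame, n_fft):
    """Hamming-windowed power spectrum via a naive O(n^2) DFT.

    Kept dependency-free for the sketch; a real pipeline would use an FFT.
    """
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n_fft // 2 + 1):
        acc = sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n_fft)
                  for t in range(n))
        spectrum.append(abs(acc) ** 2 / n_fft)
    return spectrum


def mfec(signal, sample_rate, frame_len=256, hop=128, n_filters=40, eps=1e-10):
    """Return one list of n_filters log mel energies per frame."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        spec = power_spectrum(signal[start:start + frame_len], frame_len)
        energies = [sum(w * p for w, p in zip(row, spec)) for row in bank]
        # eps guards against log(0) for filters that capture no energy.
        features.append([math.log(e + eps) for e in energies])
    return features
```

Calling `mfec(samples, sample_rate)` on a list of audio samples yields a frames-by-filters matrix. Under the same assumptions, a matrix of this kind is what would be passed to a CNN as a two-dimensional input, much like a single-channel image.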