Robust text-dependent speaker verification system using gender aware Siamese-Triplet Deep Neural Network.

IF 1.1 · Zone 3 (Computer Science) · Q4: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Sanghamitra V Arora
DOI: 10.1080/0954898X.2024.2438128 (https://doi.org/10.1080/0954898X.2024.2438128)
Journal: Network-Computation in Neural Systems, pp. 1-40
Published: 2024-12-29 (Journal Article)
Citations: 0

Abstract

Speaker verification in text-dependent scenarios is critical for high-security applications but faces challenges such as voice quality variations, linguistic diversity, and gender-related pitch differences, which affect authentication accuracy. This paper introduces a Gender-Aware Siamese-Triplet Network-Deep Neural Network (ST-DNN) architecture to address these challenges. The Gender-Aware Network utilizes Convolutional 2D layers with ReLU activation for initial feature extraction, followed by multi-fusion dense skip connections and batch normalization to integrate features across different depths, enhancing discrimination between male and female speakers. A bottleneck layer compresses feature maps to capture gender-related characteristics effectively. For enhanced speaker verification, separate male and female ST-DNN models are used, each incorporating Individual, Siamese, and Triplet Networks. The Individual Network extracts unique utterance characteristics, the Siamese Network compares speech sample pairs for speaker identity, and the Triplet Network ensures closely grouped embeddings of samples from the same speaker, facilitating precise verification. Experimental results on RSR2015 and RedDots Challenge 2016 datasets demonstrate significant improvements, with reductions in Equal Error Rate (EER) ranging from 32.31% to 54.55% for males and 33.73% to 38.98% for females, and reductions in MinDCF from 53.47% to 86.36% and 39.46% to 71.19%, respectively, validating the efficacy of the ST-DNN in real-world applications.
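
The abstract names a Triplet Network and reports results as Equal Error Rate (EER), but gives no formulas for either. The sketch below is an illustration under stated assumptions, not the paper's implementation: it assumes the standard triplet margin loss with Euclidean distance and a simple threshold-sweep approximation of EER. The function names, the margin value of 0.2, and the "accept when score >= threshold" convention are all hypothetical choices made for this example.

```python
import numpy as np


def triplet_margin_loss(anchor, positive, negative, margin=0.2):
    """Mean hinge loss max(0, d(a, p) - d(a, n) + margin) over a batch.

    anchor, positive, negative: arrays of shape (batch, dim) holding
    embeddings. The margin of 0.2 is an assumed value, not from the paper.
    """
    anchor = np.asarray(anchor, dtype=float)
    positive = np.asarray(positive, dtype=float)
    negative = np.asarray(negative, dtype=float)
    # Euclidean distance from the anchor to the same-speaker sample ...
    d_pos = np.linalg.norm(anchor - positive, axis=1)
    # ... and to the different-speaker sample.
    d_neg = np.linalg.norm(anchor - negative, axis=1)
    # The loss is zero once the negative is farther than the positive by
    # at least `margin`, which pulls same-speaker embeddings together.
    return float(np.mean(np.maximum(0.0, d_pos - d_neg + margin)))


def equal_error_rate(target_scores, impostor_scores):
    """Approximate EER by sweeping a threshold over all observed scores.

    A trial is accepted when its score is >= the threshold (assumed
    convention). Returns the mean of the false-rejection and
    false-acceptance rates at the threshold where they are closest.
    """
    target_scores = np.asarray(target_scores, dtype=float)
    impostor_scores = np.asarray(impostor_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target_scores, impostor_scores]))
    # False rejection: genuine trials scoring below the threshold.
    frr = np.array([(target_scores < t).mean() for t in thresholds])
    # False acceptance: impostor trials scoring at or above the threshold.
    far = np.array([(impostor_scores >= t).mean() for t in thresholds])
    idx = int(np.argmin(np.abs(frr - far)))
    return float((frr[idx] + far[idx]) / 2.0)
```

With these definitions, a lower EER means the genuine and impostor score distributions overlap less. The reductions quoted in the abstract are percentage decreases in this kind of error rate, not absolute error rates.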

Source journal
Network-Computation in Neural Systems (Engineering and Technology - Engineering: Electrical and Electronic)
CiteScore: 3.70
Self-citation rate: 1.30%
Articles published: 22
Review time: >12 weeks
Journal description: Network: Computation in Neural Systems welcomes submissions of research papers that integrate theoretical neuroscience with experimental data, emphasizing the utilization of cutting-edge technologies. We invite authors and researchers to contribute their work in the following areas: Theoretical Neuroscience: This section encompasses neural network modeling approaches that elucidate brain function. Neural Networks in Data Analysis and Pattern Recognition: We encourage submissions exploring the use of neural networks for data analysis and pattern recognition, including but not limited to image analysis and speech processing applications. Neural Networks in Control Systems: This category encompasses the utilization of neural networks in control systems, including robotics, state estimation, fault detection, and diagnosis. Analysis of Neurophysiological Data: We invite submissions focusing on the analysis of neurophysiology data obtained from experimental studies involving animals. Analysis of Experimental Data on the Human Brain: This section includes papers analyzing experimental data from studies on the human brain, utilizing imaging techniques such as MRI, fMRI, EEG, and PET. Neurobiological Foundations of Consciousness: We encourage submissions exploring the neural bases of consciousness in the brain and its simulation in machines.