Replay Spoof Attack Detection using Deep Neural Networks for Classification

Salahaldeen Duraibi, Wasim Alhamdani, Frederick T. Sheldon
Published in: 2020 International Conference on Computational Science and Computational Intelligence (CSCI)
Publication date: 2020-12-01
DOI: 10.1109/CSCI51800.2020.00036
URL: https://doi.org/10.1109/CSCI51800.2020.00036
Citations: 4

Abstract

In this paper, we explore the use of deep learning for replay spoof detection in speaker verification systems. Automatic speaker verification (ASV) systems can be easily spoofed by previously recorded genuine speech, so detecting spoofing attacks plays an important role in countering them. We therefore consider the detection of replay attacks, the most easily accomplished spoofing attack. To this end, we propose a deep neural network (DNN)-based classifier using a hybrid feature built from Mel-frequency cepstral coefficients (MFCC) and constant Q cepstral coefficients (CQCC). Several experiments were conducted on the latest version of the ASVspoof 2017 dataset. The results are compared with a baseline system that uses a Gaussian mixture model (GMM) classifier with different features: MFCC, CQCC, and the hybrid feature of the two. The experimental results reveal that the DNN classifier outperforms the conventional GMM classifier. The hybrid feature was found to be superior to the single features, CQCC and MFCC, in terms of equal error rate (EER). In addition, as many previous researchers have found, the high-frequency regions of a speech utterance convey much more discriminative information for replay attack detection.
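
The abstract compares systems by equal error rate, the operating point at which the false acceptance rate (spoofed audio accepted) equals the false rejection rate (genuine audio rejected). The following is a minimal sketch of that metric, not the authors' scoring code. It assumes that higher scores mean "more likely genuine" and uses a plain threshold sweep. Evaluation toolkits typically interpolate the ROC curve instead, so their numbers can differ slightly. The function name and the example scores are hypothetical.

```python
from typing import Sequence, Tuple


def equal_error_rate(genuine: Sequence[float],
                     spoof: Sequence[float]) -> Tuple[float, float]:
    """Approximate the equal error rate by sweeping candidate thresholds.

    Assumption: a trial is accepted as genuine when score >= threshold.
    Returns (eer, threshold) at the point where the false acceptance rate
    and the false rejection rate are closest to each other.
    """
    if len(genuine) == 0 or len(spoof) == 0:
        raise ValueError("both score lists must be non-empty")

    # Every observed score is a candidate threshold; one extra threshold
    # above the maximum covers the "reject everything" operating point.
    candidates = sorted(set(genuine) | set(spoof))
    candidates.append(candidates[-1] + 1.0)

    best_gap = float("inf")
    best_eer = 1.0
    best_threshold = candidates[0]
    for threshold in candidates:
        # False acceptance: spoofed trials scored at or above the threshold.
        far = sum(s >= threshold for s in spoof) / len(spoof)
        # False rejection: genuine trials scored below the threshold.
        frr = sum(g < threshold for g in genuine) / len(genuine)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap = gap
            best_eer = (far + frr) / 2.0
            best_threshold = threshold
    return best_eer, best_threshold


# Hypothetical classifier scores, for illustration only.
genuine_scores = [0.9, 0.8, 0.7, 0.3]
spoof_scores = [0.6, 0.4, 0.2, 0.1]
eer, threshold = equal_error_rate(genuine_scores, spoof_scores)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With these made-up scores, one genuine trial falls below the threshold of 0.6 and one spoofed trial reaches it, so both error rates are 25%. A lower EER indicates a better detector, which is the sense in which the abstract ranks the hybrid feature above MFCC or CQCC alone.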