Speech Demodulation-based Techniques for Replay and Presentation Attack Detection

Madhu R. Kamble, Aditya Krishna Sai Pulikonda, Maddala Venkata Siva Krishna, Ankur T. Patil, R. Acharya, H. Patil
Published in: 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
Publication date: 2019-11-01
DOI: 10.1109/APSIPAASC47483.2019.9023046 (https://doi.org/10.1109/APSIPAASC47483.2019.9023046)
Citations: 1

Abstract

Spoofing is one of the threats that can bypass voice biometrics and gain access to a system. In particular, Automatic Speaker Verification (ASV) systems are vulnerable to various kinds of spoofing attacks. This paper extends our earlier work: we investigate a combination of different speech demodulation techniques, namely the Hilbert Transform (HT), the Energy Separation Algorithm (ESA), and its variable-length version (VESA), for the replay Spoof Speech Detection (SSD) task. The feature sets are developed using the Instantaneous Amplitude and Instantaneous Frequency (IA-IF) components of narrowband-filtered speech signals obtained from a linearly spaced Gabor filterbank. We compare the relative effectiveness of these demodulation techniques on two spoof speech databases, BTAS 2016 and the ASVspoof 2017 version 2.0 challenge database, which focus on presentation and replay attacks, respectively. The different demodulation techniques gave comparable results on both databases, with only small variations in % Equal Error Rate (EER). For VESA, a Dependency Index (DI) of 2 gave relatively better performance than the other DI values on both databases for the SSD task. All the demodulation-based feature sets gave a lower % EER than the corresponding baseline system on both databases.
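As a rough illustration of how IA-IF components can be extracted from a narrowband signal, the sketch below compares two of the demodulation approaches named in the abstract. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`hilbert_ia_if`, `teager_energy`, `esa_desa2`) are hypothetical, the ESA variant used is DESA-2 (the abstract does not say which discrete ESA variant the paper uses), the input is assumed to be already narrowband (the Gabor filterbank stage is omitted), and VESA is not covered.

```python
import numpy as np


def hilbert_ia_if(x, fs):
    """IA and IF (in Hz) from the analytic signal built with an FFT-based Hilbert transform."""
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)
    # One-sided spectral weights: keep DC (and Nyquist), double positive frequencies.
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    z = np.fft.ifft(spectrum * h)  # analytic signal
    ia = np.abs(z)
    # IF is the discrete derivative of the unwrapped phase.
    inst_f = np.diff(np.unwrap(np.angle(z))) * fs / (2.0 * np.pi)
    return ia, inst_f


def teager_energy(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]


def esa_desa2(x, fs):
    """IA and IF (in Hz) via DESA-2 (assumed variant); valid for IF below fs / 4."""
    x = np.asarray(x, dtype=float)
    eps = 1e-12  # floor that guards against division by zero
    y = x[2:] - x[:-2]                    # symmetric difference y[n] = x[n+1] - x[n-1]
    psi_y = teager_energy(y)              # defined for n = 2 .. N-3
    psi_x = teager_energy(x)[1:-1]        # trimmed to the same n = 2 .. N-3
    psi_x = np.maximum(psi_x, 0.0)
    psi_y = np.maximum(psi_y, eps)
    ratio = 1.0 - psi_y / np.maximum(2.0 * psi_x, eps)
    omega = 0.5 * np.arccos(np.clip(ratio, -1.0, 1.0))  # radians per sample
    ia = 2.0 * psi_x / np.sqrt(psi_y)
    return ia, omega * fs / (2.0 * np.pi)


# Synthetic narrowband test signal: a 1 kHz tone with amplitude 0.5 at 16 kHz sampling.
fs = 16000
t = np.arange(1600) / fs
tone = 0.5 * np.cos(2.0 * np.pi * 1000.0 * t)

ia_ht, if_ht = hilbert_ia_if(tone, fs)
ia_esa, if_esa = esa_desa2(tone, fs)
print("HT : median IA", np.median(ia_ht), "median IF", np.median(if_ht))
print("ESA: median IA", np.median(ia_esa), "median IF", np.median(if_esa))
```

On a pure tone both estimators should recover roughly the same amplitude and frequency; differences between them only become visible on real subband speech, which is the comparison the paper reports in terms of % EER.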