Madhu R. Kamble, Aditya Krishna Sai Pulikonda, Maddala Venkata Siva Krishna, Ankur T. Patil, R. Acharya, H. Patil
Speech Demodulation-based Techniques for Replay and Presentation Attack Detection
2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), published 2019-11-01
DOI: 10.1109/APSIPAASC47483.2019.9023046 (https://doi.org/10.1109/APSIPAASC47483.2019.9023046)
Citations: 1
Abstract
Spoofing is a threat in which an attacker bypasses voice biometrics and gains access to the system. In particular, Automatic Speaker Verification (ASV) systems are vulnerable to various kinds of spoofing attacks. This paper extends our earlier work: we investigate a combination of speech demodulation techniques, namely the Hilbert Transform (HT), the Energy Separation Algorithm (ESA), and its variable-length version (VESA), for the replay Spoof Speech Detection (SSD) task. The feature sets are developed using the Instantaneous Amplitude and Instantaneous Frequency (IA-IF) components of narrowband-filtered speech signals obtained from a linearly spaced Gabor filterbank. We evaluated the relative effectiveness of these demodulation techniques on two spoof speech databases, BTAS 2016 and the ASVspoof 2017 version 2.0 challenge database, which focus on presentation and replay attacks, respectively. The different demodulation techniques gave comparable results on both databases, with only small variations in % Equal Error Rate (EER). For VESA, a Dependency Index (DI) of 2 gave relatively better performance than the other DI values on both databases for the SSD task. All the demodulation-based feature sets gave a lower % EER than the baseline system for each database.
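As an illustration of what ESA-style demodulation computes, the following minimal sketch estimates IA and IF from a single narrowband signal using the Teager-Kaiser energy operator and the DESA-2 variant of the energy separation algorithm. The abstract does not state which ESA variant, filterbank parameters, or feature post-processing the authors used, so the choice of DESA-2, the function names `teager` and `desa2`, and the test tone are assumptions made only for this example. The Gabor filterbank stage is omitted.

```python
import math


def teager(x, n):
    """Teager-Kaiser energy operator at sample n: x[n]^2 - x[n-1] * x[n+1]."""
    return x[n] * x[n] - x[n - 1] * x[n + 1]


def desa2(x, fs):
    """Estimate IA and IF (in Hz) of a narrowband signal with DESA-2.

    Assumed variant for illustration; not necessarily the authors' choice.
    Returns two lists covering samples 2 .. len(x) - 3.
    """
    # Symmetric difference y[n] = x[n+1] - x[n-1]; the two end samples
    # are zero padding and are never read by the loop below.
    y = [0.0] + [x[n + 1] - x[n - 1] for n in range(1, len(x) - 1)] + [0.0]
    ia, inst_freq = [], []
    for n in range(2, len(x) - 2):
        psi_x = teager(x, n)
        psi_y = teager(y, n)
        if psi_x <= 0.0 or psi_y <= 0.0:
            # The operator can go non-positive on noisy or wideband input;
            # mark such samples as undefined rather than guessing.
            ia.append(float("nan"))
            inst_freq.append(float("nan"))
            continue
        # Clamp to the valid arccos domain to absorb rounding error.
        arg = max(-1.0, min(1.0, 1.0 - psi_y / (2.0 * psi_x)))
        omega = 0.5 * math.acos(arg)  # radians per sample
        ia.append(2.0 * psi_x / math.sqrt(psi_y))
        inst_freq.append(omega * fs / (2.0 * math.pi))
    return ia, inst_freq


# Usage: a constant-amplitude 1 kHz tone sampled at 16 kHz (hypothetical input).
fs = 16000
tone = [0.5 * math.cos(2.0 * math.pi * 1000.0 * n / fs) for n in range(200)]
ia, inst_freq = desa2(tone, fs)
```

For a pure sinusoid these estimates are exact up to rounding, so `ia` stays near 0.5 and `inst_freq` near 1000 Hz. DESA-2 only resolves frequencies below a quarter of the sampling rate, which is one reason the signal is band-limited by a filterbank before demodulation.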