Title: A multi-branch ResNet with discriminative features for detection of replay speech signals
Authors: Xingliang Cheng, Mingxing Xu, T. Zheng
Journal: APSIPA Transactions on Signal and Information Processing
Published: 2020-12-29
DOI: https://doi.org/10.1017/ATSIP.2020.26
Citations: 3
Abstract
The security of automatic speaker verification (ASV) systems is attracting increasing attention. Replay attacks, one of the most common spoofing methods, are easy to implement but difficult to detect. Many researchers focus on designing features that capture the distortion introduced by replay attacks. Constant-Q cepstral coefficients (CQCC), which are based on the magnitude of the constant-Q transform (CQT), are among the most prominent features in replay detection. However, CQCC ignores phase information, which may also be distorted during replay. In this work, we propose a CQT-based modified group delay feature (CQTMGD) that captures the phase information of the CQT. Furthermore, we propose a multi-branch residual convolutional network, ResNeWt, to distinguish replay attacks from bona fide attempts. We evaluated our proposal on the ASVspoof 2019 physical access dataset. Results show that CQTMGD outperformed the traditional MGD feature, and that fusion with other magnitude-based and phase-based features yielded a further improvement. Our best fusion system achieved a min-tDCF of 0.0096 and an EER of 0.39% on the evaluation set, outperforming all other state-of-the-art methods in the ASVspoof 2019 physical access challenge.
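For readers unfamiliar with the equal error rate (EER) reported above, the following is a minimal sketch of how an EER can be estimated from detection scores. It is not the official ASVspoof scoring tool and does not compute min-tDCF. The function name `compute_eer`, the convention that higher scores mean "more likely bona fide", and the example scores are all assumptions made for illustration.

```python
import numpy as np


def compute_eer(bonafide_scores, spoof_scores):
    """Estimate the equal error rate by sweeping a decision threshold.

    Assumption: a trial is accepted as bona fide when score >= threshold.
    Returns (eer, threshold) at the point where the false acceptance rate
    and the false rejection rate are closest to each other.
    """
    bonafide = np.asarray(bonafide_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)

    # Candidate thresholds: every observed score (np.unique returns them
    # sorted), plus one value above the maximum so that "reject everything"
    # is also considered.
    thresholds = np.unique(np.concatenate([bonafide, spoof]))
    thresholds = np.append(thresholds, thresholds[-1] + 1.0)

    # False rejection rate: bona fide trials scored below the threshold.
    frr = np.array([(bonafide < t).mean() for t in thresholds])
    # False acceptance rate: spoof trials scored at or above the threshold.
    far = np.array([(spoof >= t).mean() for t in thresholds])

    # The EER lies where the two error curves cross; with finitely many
    # scores they may not meet exactly, so average them at the closest point.
    idx = int(np.argmin(np.abs(far - frr)))
    return (far[idx] + frr[idx]) / 2.0, thresholds[idx]


if __name__ == "__main__":
    # Hypothetical scores, chosen only to exercise the function.
    eer, thr = compute_eer([0.9, 0.8, 0.3], [0.1, 0.2, 0.7])
    print(f"EER = {eer:.4f} at threshold {thr:.2f}")
```

Sweeping only the observed scores keeps the sketch short; scoring tools typically interpolate between neighbouring operating points, so values can differ slightly on small score sets.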