Development of Voice Spoofing Detection Systems for 2019 Edition of Automatic Speaker Verification and Countermeasures Challenge

2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
Pub Date: 2019-12-01
DOI: 10.1109/ASRU46091.2019.9003792

João Monteiro, Md. Jahangir Alam


Citations: 17

Abstract

A robust speaker verification system is expected to provide high recognition accuracy not only in adverse environments but also in the presence of spoofing attacks, which makes voice spoofing detection crucial for protecting automatic speaker verification systems against security breaches. In this work, we present anti-spoofing systems developed to tackle the spoofing attacks introduced for the ASVspoof 2019 challenge. We employ frame-level descriptors based on the discrete Fourier transform, as well as constant Q transform-based spectral and cepstral features, as countermeasures. These descriptors are used either on their own with a spoofing detection classifier or in tandem with deep bottleneck features, i.e. approximate posteriors parametrized by a neural network designed to discriminate between bonafide and spoof signals. Fisher vector encodings and i-vector representations are further learned from the frame-level descriptors of the signals. For modeling, we employ two classification strategies. Finally, we build an end-to-end anti-spoofing system using modified versions of light convolutional neural networks as well as well-known ResNets. With our primary system for the logical access task and a single end-to-end system for the physical access task, we attain significant improvements over two baseline systems.
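To make the notion of frame-level descriptors concrete, the following is a minimal sketch of the DFT-based branch only: it frames a waveform, computes a log power spectrum per frame, and derives cepstral coefficients with a DCT-II. It is not the authors' front end. The function names, the 25 ms / 10 ms framing at an assumed 16 kHz sampling rate, the FFT size, and the number of cepstral coefficients are all illustrative assumptions, and the constant Q transform features mentioned in the abstract are not covered.

```python
import numpy as np

def frame_signal(x, frame_len=400, hop=160):
    """Slice a 1-D signal into overlapping frames, one frame per row.
    frame_len and hop are assumed values (25 ms / 10 ms at 16 kHz)."""
    if len(x) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def log_power_spectrum(x, frame_len=400, hop=160, n_fft=512, eps=1e-10):
    """Frame-level log power spectrum from the DFT of Hann-windowed frames."""
    frames = frame_signal(x, frame_len, hop) * np.hanning(frame_len)
    spec = np.fft.rfft(frames, n=n_fft, axis=1)  # shape: (n_frames, n_fft // 2 + 1)
    return np.log(np.abs(spec) ** 2 + eps)       # eps avoids log(0)

def cepstral_coefficients(log_spec, n_ceps=20):
    """Unnormalized DCT-II of each log-spectral frame, keeping n_ceps terms."""
    n_bins = log_spec.shape[1]
    k = np.arange(n_ceps)[:, None]
    n = np.arange(n_bins)[None, :]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_bins))
    return log_spec @ basis.T                    # shape: (n_frames, n_ceps)
```

Each row of the returned matrices describes one frame. Sequences of such rows are the kind of input that an utterance-level encoding (for example a Fisher vector or an i-vector) or a convolutional network would consume.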



