{"title":"SpecAugment对自动说话人验证系统的影响","authors":"M. Faisal, S. Suyanto","doi":"10.1109/ISRITI48646.2019.9034603","DOIUrl":null,"url":null,"abstract":"An automatic speaker verification (ASV) is one of the challenging problem in speech processing since there are so many models of machine learnings those capable of synthesizing a fake speech from a given text. This paper discusses the impact of SpecAugment to methods such as Gaussian Mixture Models (GMM) and Deep Neural Networks (DNNs). Some experiments on a speech dataset sampled from the ASVSpoof2019, which is specially made to tackle the threat of spoofing, show that DNNs produces an Equal Error Rate (EER) of 18.1% that is better than the GMM system with EER of 19.0%. And after combining with a traditional augmentation technique, the DNNs also gives a better EER of 15.3% than GMM with EER of 15.7%.","PeriodicalId":367363,"journal":{"name":"2019 International Seminar on Research of Information Technology and Intelligent Systems (ISRITI)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"16","resultStr":"{\"title\":\"SpecAugment Impact on Automatic Speaker Verification System\",\"authors\":\"M. Faisal, S. Suyanto\",\"doi\":\"10.1109/ISRITI48646.2019.9034603\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"An automatic speaker verification (ASV) is one of the challenging problem in speech processing since there are so many models of machine learnings those capable of synthesizing a fake speech from a given text. This paper discusses the impact of SpecAugment to methods such as Gaussian Mixture Models (GMM) and Deep Neural Networks (DNNs). Some experiments on a speech dataset sampled from the ASVSpoof2019, which is specially made to tackle the threat of spoofing, show that DNNs produces an Equal Error Rate (EER) of 18.1% that is better than the GMM system with EER of 19.0%. 
And after combining with a traditional augmentation technique, the DNNs also gives a better EER of 15.3% than GMM with EER of 15.7%.\",\"PeriodicalId\":367363,\"journal\":{\"name\":\"2019 International Seminar on Research of Information Technology and Intelligent Systems (ISRITI)\",\"volume\":\"9 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"16\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 International Seminar on Research of Information Technology and Intelligent Systems (ISRITI)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ISRITI48646.2019.9034603\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 International Seminar on Research of Information Technology and Intelligent Systems (ISRITI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISRITI48646.2019.9034603","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
SpecAugment Impact on Automatic Speaker Verification System
Automatic speaker verification (ASV) is one of the challenging problems in speech processing, since many machine learning models are now capable of synthesizing fake speech from a given text. This paper discusses the impact of SpecAugment on methods such as Gaussian Mixture Models (GMMs) and Deep Neural Networks (DNNs). Experiments on a speech dataset sampled from ASVspoof 2019, which was created specifically to tackle the threat of spoofing, show that the DNN produces an Equal Error Rate (EER) of 18.1%, better than the GMM system's EER of 19.0%. When combined with a traditional augmentation technique, the DNN again gives a better EER (15.3%) than the GMM (15.7%).
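The abstract names two ideas that a short sketch may make more concrete: SpecAugment-style masking of a spectrogram, and the EER used to compare systems. The code below is an illustrative sketch only and is not the authors' implementation. The function names, the mask widths, the choice of zero as the fill value, and the simple threshold sweep are all assumptions, and SpecAugment's time-warping step is omitted.

```python
import random


def spec_augment(spec, max_freq_width=2, max_time_width=2, rng=None):
    """Return a masked copy of `spec`, a list of frequency rows,
    each row being a list of time-frame values.

    One frequency band and one time band are set to 0.0 (assumed fill value).
    """
    rng = rng or random.Random()
    n_freq, n_time = len(spec), len(spec[0])
    out = [row[:] for row in spec]  # copy so the input is left untouched

    # Frequency mask: blank f consecutive rows starting at f0.
    f = rng.randint(0, min(max_freq_width, n_freq))
    f0 = rng.randint(0, n_freq - f)
    for i in range(f0, f0 + f):
        out[i] = [0.0] * n_time

    # Time mask: blank t consecutive columns starting at t0.
    t = rng.randint(0, min(max_time_width, n_time))
    t0 = rng.randint(0, n_time - t)
    for row in out:
        for j in range(t0, t0 + t):
            row[j] = 0.0
    return out


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping thresholds over the observed scores.

    A trial is accepted when score >= threshold. The function returns the
    mean of the false acceptance and false rejection rates at the threshold
    where the two rates are closest.
    """
    best_gap, eer = None, None
    for thr in sorted(set(target_scores) | set(nontarget_scores)):
        frr = sum(s < thr for s in target_scores) / len(target_scores)
        far = sum(s >= thr for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


if __name__ == "__main__":
    toy_spec = [[1.0] * 6 for _ in range(4)]  # 4 frequency bins, 6 frames
    for row in spec_augment(toy_spec, rng=random.Random(0)):
        print(row)

    # Hypothetical scores, not taken from the paper.
    genuine = [0.9, 0.8, 0.7, 0.4]
    spoofed = [0.6, 0.5, 0.3, 0.2]
    print(equal_error_rate(genuine, spoofed))  # 0.25
```

In this toy example one genuine trial scores below two spoofed ones, so at a threshold of 0.6 both error rates equal 25%. A lower EER, as reported for the DNN over the GMM, means the two score distributions overlap less.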