Ashutosh Pandey, Rohan Kumar Das, Nagaraj Adiga, Naresh Gupta, S. R. Mahadeva Prasanna
TENCON 2015 - 2015 IEEE Region 10 Conference, pp. 1-6, published 2015-11-01. DOI: 10.1109/TENCON.2015.7372916 (https://doi.org/10.1109/TENCON.2015.7372916)
Significance of glottal activity detection for speaker verification in degraded and limited data condition
The objective of this work is to establish the importance of the speaker information present in the glottal regions of the speech signal. In addition, its robustness to degraded data and its significance for limited data are examined for the task of speaker verification. An adaptive threshold method is proposed that operates on the zero-frequency filtered signal to obtain the glottal activity regions. Feature vectors are extracted from regions having significant glottal activity. An i-vector based speaker verification system is developed using the NIST SRE 2003 database, and the performance of the proposed method is evaluated under degraded and limited data conditions. Robustness of the proposed method is tested against white and babble noise. Further, short test utterances are considered to evaluate performance in the limited data condition. The proposed method, based on the selection of glottal regions, is found to perform better than the baseline energy-based voice activity detection method in degraded and limited data conditions.
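The abstract does not state the adaptive threshold rule, so the following is only a minimal sketch of the general idea: zero-frequency filtering followed by a per-utterance energy threshold. The function names, the 10 ms trend-removal window, the 20 ms/10 ms framing, and the `ratio` parameter are all assumptions made for illustration, not the authors' settings.

```python
import numpy as np


def zero_frequency_filter(x, fs, win_ms=10.0, passes=3):
    """Return a zero-frequency filtered (ZFF) version of the signal x."""
    x = np.asarray(x, dtype=np.float64)
    half = max(1, int(round(win_ms * 1e-3 * fs / 2)))
    edge = passes * half
    if len(x) <= 2 * edge:
        raise ValueError("signal too short for the chosen window")
    # First-order difference removes any DC offset in the recording.
    y = np.diff(x, prepend=x[0])
    # Two cascaded zero-frequency resonators amount to four integrations.
    for _ in range(4):
        y = np.cumsum(y)
    # Remove the polynomial trend by repeatedly subtracting a local mean.
    kernel = np.ones(2 * half + 1) / (2 * half + 1)
    for _ in range(passes):
        y = y - np.convolve(y, kernel, mode="same")
    # Samples near the edges are corrupted by the zero-padded convolution.
    y[:edge] = 0.0
    y[len(y) - edge:] = 0.0
    return y


def glottal_activity_mask(x, fs, frame_ms=20.0, hop_ms=10.0, ratio=0.1):
    """Flag frames whose ZFF energy exceeds a per-utterance threshold."""
    z = zero_frequency_filter(x, fs)
    frame = int(round(frame_ms * 1e-3 * fs))
    hop = int(round(hop_ms * 1e-3 * fs))
    n_frames = 1 + (len(z) - frame) // hop
    energy = np.array(
        [np.mean(z[i * hop:i * hop + frame] ** 2) for i in range(n_frames)]
    )
    # Assumed adaptive rule: the threshold scales with the utterance's
    # own mean ZFF energy rather than being a fixed constant.
    threshold = ratio * energy.mean()
    return energy > threshold
```

In a pipeline like the one described, the resulting boolean mask would select the frames from which feature vectors are extracted, in place of the mask produced by an energy-based voice activity detector applied to the raw signal.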