Authors: N. Fatima, T. Zheng
Published in: 2012 International Conference on Systems and Informatics (ICSAI2012), 2012-05-19
DOI: https://doi.org/10.1109/ICSAI.2012.6223387
Vowel-category based Short Utterance Speaker Recognition
Short utterances have a significant impact on speaker recognition. Despite advances in short utterance speaker recognition (SUSR), text dependence and the role of phonemes in carrying speaker information need further investigation. This paper presents a novel method that uses vowel categories for SUSR. We define vowel categories (VCs) for the Chinese and English languages. After phonemes are recognized and extracted, the resulting vowels are divided into VCs, and a Universal Background VC Model (UBVCM) is developed for each VC. A conventional GMM-UBM system is used for training and testing. The proposed categories give minimum equal error rates (EERs) of 13.76%, 14.03% and 16.18% for durations of 3, 2 and 1 seconds, respectively. Experimental results show that in text-dependent SUSR, significant speaker-specific information is present at the phoneme level. Because phonemes within a category share similar properties, accurate speech recognition is not required; phoneme categories can instead be used effectively for SUSR. The results also show that vowels contain a large amount of speaker information, which is preserved when VCs are employed.
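The category-wise pipeline described in the abstract can be illustrated with a minimal sketch. It makes several assumptions that are not taken from the paper: the `VOWEL_CATEGORY` map (front/back/central, with ARPAbet-style labels) is hypothetical, since the abstract does not list the actual VCs; each category model is a single diagonal Gaussian rather than a GMM; and the speaker models are fitted directly instead of being adapted from the background models as in a conventional GMM-UBM system. The function names are illustrative only.

```python
import math
from collections import defaultdict

# Hypothetical vowel-to-category map; the paper's real VC definitions
# are not given in the abstract.
VOWEL_CATEGORY = {
    "iy": "front", "ih": "front", "eh": "front", "ae": "front",
    "aa": "back", "ao": "back", "uw": "back", "uh": "back",
    "ah": "central", "er": "central",
}


def group_by_category(segments):
    """Pool feature frames by vowel category; non-vowel labels are skipped.

    `segments` is a list of (phoneme_label, frames) pairs, where `frames`
    is a list of equal-length feature vectors.
    """
    pooled = defaultdict(list)
    for label, frames in segments:
        category = VOWEL_CATEGORY.get(label)
        if category is not None:
            pooled[category].extend(frames)
    return dict(pooled)


def fit_gaussian(frames, floor=1e-3):
    """Fit one diagonal Gaussian (a simplified stand-in for a GMM)."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    # Variance is floored so that tiny training sets do not produce zeros.
    var = [
        max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, floor)
        for d in range(dim)
    ]
    return mean, var


def log_likelihood(frame, model):
    """Log density of one frame under a diagonal Gaussian."""
    mean, var = model
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(frame, mean, var)
    )


def score(test_segments, speaker_models, background_models):
    """Average per-frame log-likelihood ratio, computed category by category.

    Frames are scored only against the speaker and background models of
    their own vowel category. Returns -inf when no vowel frame is usable.
    """
    total, count = 0.0, 0
    for category, frames in group_by_category(test_segments).items():
        if category not in speaker_models or category not in background_models:
            continue
        for frame in frames:
            total += log_likelihood(frame, speaker_models[category])
            total -= log_likelihood(frame, background_models[category])
            count += 1
    return total / count if count else float("-inf")
```

A positive score favours the claimed speaker and a negative one favours the background population. Sweeping a decision threshold over such scores for target and impostor trials is how an error rate such as the EER would be obtained.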