Text-Independent Speaker Recognition Based on Syllabic Pitch Contour Parameters
M. Ahmed, Z. Bawar
Proceedings of the Fourth International Conference on Engineering & MIS 2018, published 2018-06-19
DOI: 10.1145/3234698.3234711 (https://doi.org/10.1145/3234698.3234711)
Citations: 2
Abstract
This paper proposes a new approach to extracting and representing the temporal information of pitch that is useful for real-time, text-independent speaker recognition. Pitch is relatively unaffected by channel variations and noisy environments. As the number of speakers increases, however, pitch values taken as a whole become less distinctive. To identify the speaker, a modified energy-based approach is used to segment continuous speech automatically into syllable-like units. These syllable segments are then used to estimate temporal pitch variation, such as the rise, fall, slope, peak, and rate of pitch change, which is specific to a particular speaker and less affected by noisy environments. Experimental results on the SITW corpus show that the proposed method, using parameters calculated on the syllabic pitch contour, improves recognition accuracy by 15% compared with average pitch and full pitch contour parameters. Overall accuracy increases by 30% when MFCC and syllabic pitch features are combined.
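The abstract does not give the segmentation rule or the exact parameter definitions, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a plain energy threshold (a fraction of the maximum frame energy) in place of the paper's modified energy-based approach, and it assumes simple definitions for rise, fall, slope, peak, and rate of change. The function names `segment_by_energy` and `contour_parameters`, and the parameters `threshold_ratio`, `min_frames`, and `frame_rate`, are hypothetical. The per-frame energy and pitch values are taken as given; pitch tracking is outside the sketch.

```python
from statistics import mean


def segment_by_energy(energy, threshold_ratio=0.2, min_frames=3):
    """Return (start, end) frame ranges whose energy stays above a threshold.

    Assumption: the threshold is a fixed fraction of the maximum frame
    energy. `end` is exclusive.
    """
    if not energy:
        return []
    threshold = threshold_ratio * max(energy)
    segments, start = [], None
    for i, e in enumerate(energy):
        if e > threshold and start is None:
            start = i                      # a syllable-like unit opens
        elif e <= threshold and start is not None:
            if i - start >= min_frames:    # drop very short bursts
                segments.append((start, i))
            start = None
    # Close a unit that is still open at the end of the signal.
    if start is not None and len(energy) - start >= min_frames:
        segments.append((start, len(energy)))
    return segments


def contour_parameters(pitch, frame_rate):
    """Summarise one unit's pitch contour (Hz values, one per frame).

    Assumption: all definitions below are illustrative choices.
    """
    n = len(pitch)
    if n < 2:
        raise ValueError("need at least two pitch frames")
    peak = max(pitch)
    times = [i / frame_rate for i in range(n)]
    t_mean, p_mean = mean(times), mean(pitch)
    # Least-squares slope of pitch against time, in Hz per second.
    slope = (sum((t - t_mean) * (p - p_mean) for t, p in zip(times, pitch))
             / sum((t - t_mean) ** 2 for t in times))
    # Mean absolute frame-to-frame change, in Hz per second.
    rate = mean(abs(b - a) for a, b in zip(pitch, pitch[1:])) * frame_rate
    return {
        "rise": peak - pitch[0],    # climb from onset to peak
        "fall": peak - pitch[-1],   # drop from peak to offset
        "slope": slope,
        "peak": peak,
        "rate": rate,
    }
```

In use, each range returned by `segment_by_energy` would select a slice of the pitch track, and `contour_parameters` would turn that slice into one feature vector per syllable-like unit.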