Speaker recognition system based on pitch estimation
Authors: Makrem Ben Jdira, Imen Jemaa, K. Ouni
Published in: 2014 International Conference on Electrical Sciences and Technologies in Maghreb (CISTEM), 2014-11-01
DOI: 10.1109/CISTEM.2014.7076752 (https://doi.org/10.1109/CISTEM.2014.7076752)
Citations: 6
Abstract
Automatic speaker recognition identifies an individual from an audio recording of their voice. Several techniques exist in the current state of the art. We designed a new technique, comparable to existing ones, that uses the vibration frequency of the vocal cords, known as the fundamental frequency (F0). F0 depends strongly on the physiological characteristics of the individual and differs markedly from one person to another. We reviewed existing pitch estimation techniques and chose YAAPT (Yet Another Algorithm for Pitch Tracking). We then computed, for each speaker, the probability distribution of the F0 values occurring in the speech signal and modeled it with a Gaussian mixture model (GMM). We tested the technique in text-independent mode and compared its performance with that of other techniques in the literature.
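The modeling step described in the abstract (one GMM per speaker over F0 values, then picking the speaker whose model best explains a test utterance) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the F0 tracks are synthetic stand-ins for the output of a pitch tracker such as YAAPT, the function names (`fit_gmm_1d`, `avg_log_likelihood`, `identify`), the number of mixture components, the variance floor, and the convention that unvoiced frames carry a non-positive F0 are all assumptions made for the example.

```python
import math
import random


def _normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def voiced(values):
    """Keep voiced frames only (assumption: unvoiced frames have F0 <= 0)."""
    return [v for v in values if v > 0]


def fit_gmm_1d(values, n_components=2, n_iter=50, min_var=1.0):
    """Fit a 1-D Gaussian mixture to F0 values (Hz) with plain EM."""
    data = sorted(voiced(values))
    n = len(data)
    # Initialise means at evenly spaced quantiles, with a shared variance.
    means = [data[int((k + 0.5) * n / n_components)] for k in range(n_components)]
    mean_all = sum(data) / n
    var_all = max(sum((x - mean_all) ** 2 for x in data) / n, min_var)
    variances = [var_all] * n_components
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            p = [w * _normal_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weights, means, and (floored) variances.
        for k in range(n_components):
            nk = sum(r[k] for r in resp) or 1e-12
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var_k, min_var)
    return weights, means, variances


def avg_log_likelihood(values, model):
    """Average per-frame log-likelihood of voiced F0 values under a GMM."""
    weights, means, variances = model
    frames = voiced(values)
    total = 0.0
    for x in frames:
        p = sum(w * _normal_pdf(x, m, v)
                for w, m, v in zip(weights, means, variances))
        total += math.log(max(p, 1e-300))
    return total / len(frames)


def identify(values, models):
    """Return the enrolled speaker whose GMM scores the F0 track highest."""
    return max(models, key=lambda name: avg_log_likelihood(values, models[name]))


def synthetic_f0(rng, mean_hz, std_hz, n):
    """Synthetic stand-in for a pitch tracker's voiced-frame F0 output."""
    return [rng.gauss(mean_hz, std_hz) for _ in range(n)]


rng = random.Random(0)
# Enrolment: one GMM per (hypothetical) speaker.
models = {
    "speaker_low": fit_gmm_1d(synthetic_f0(rng, 120.0, 15.0, 500)),
    "speaker_high": fit_gmm_1d(synthetic_f0(rng, 210.0, 20.0, 500)),
}
# Test utterance drawn near the second speaker's F0 range.
test_f0 = synthetic_f0(rng, 205.0, 20.0, 200)
predicted = identify(test_f0, models)
print(predicted)
```

Because the models are fitted to the distribution of F0 values rather than to any particular word sequence, the scoring does not depend on what is said, which is in keeping with the text-independent mode mentioned above. In practice a library GMM would likely replace the hand-written EM loop; it is spelled out here only to keep the example dependency-free.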