{"title":"Speaker segmentation using adapted GMMs","authors":"Mohamed Lazhar Bellagha, M. Labidi, M. Maraoui","doi":"10.1109/ICEMIS.2017.8273020","DOIUrl":null,"url":null,"PeriodicalId":117908,"journal":{"name":"2017 International Conference on Engineering & MIS (ICEMIS)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 International Conference on Engineering & MIS (ICEMIS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICEMIS.2017.8273020","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Broadcast news generally involves a large number of speakers, who frequently appear under varying acoustic conditions. As a result, the audio stream contains many speaker turns. Because each turn provides only a small amount of data (short speech segments), segmentation techniques tend to lack robustness. To address the problems caused by this high number of speaker interventions, we propose an unsupervised speaker segmentation approach that retains the high accuracy of model-selection-based methods. The approach uses adapted Gaussian mixture models (GMMs) to describe each speaker's characteristics. Experimental results on several news programs show that the proposed method significantly improves segmentation and therefore reduces the diarization error.
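
The abstract does not state how the GMMs are adapted or how a speaker change is decided, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes mean-only MAP adaptation of a shared background GMM to each of two adjacent windows, followed by a cross-likelihood score between the two adapted models. The one-dimensional synthetic features, the relevance factor of 16, and all function and variable names (`map_adapt_means`, `change_score`, `ubm`) are hypothetical choices made for illustration.

```python
import math
import random

def gauss_pdf(x, mean, var):
    """Density of a 1-D Gaussian."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def mixture_terms(x, gmm):
    """Weighted component densities w_i * N(x; mean_i, var_i) for one frame."""
    return [w * gauss_pdf(x, m, v)
            for w, m, v in zip(gmm["weights"], gmm["means"], gmm["variances"])]

def map_adapt_means(frames, ubm, relevance=16.0):
    """Return a copy of `ubm` whose means are MAP-adapted to `frames`.

    Assumption: only the means are adapted; weights and variances are shared.
    """
    k = len(ubm["means"])
    counts = [0.0] * k  # soft frame counts per component
    sums = [0.0] * k    # responsibility-weighted feature sums
    for x in frames:
        terms = mixture_terms(x, ubm)
        total = sum(terms) or 1e-300
        for i, t in enumerate(terms):
            counts[i] += t / total
            sums[i] += (t / total) * x
    means = []
    for i in range(k):
        # alpha grows with the amount of data assigned to component i,
        # so short segments stay close to the background model.
        alpha = counts[i] / (counts[i] + relevance)
        data_mean = sums[i] / counts[i] if counts[i] > 0.0 else ubm["means"][i]
        means.append(alpha * data_mean + (1.0 - alpha) * ubm["means"][i])
    return {"weights": ubm["weights"], "means": means, "variances": ubm["variances"]}

def avg_loglik(frames, gmm):
    """Average per-frame log-likelihood of `frames` under `gmm`."""
    return sum(math.log(max(sum(mixture_terms(x, gmm)), 1e-300))
               for x in frames) / len(frames)

def change_score(left, right, ubm):
    """Cross-likelihood score: large when the two windows need different models."""
    model_l = map_adapt_means(left, ubm)
    model_r = map_adapt_means(right, ubm)
    return ((avg_loglik(left, model_l) - avg_loglik(left, model_r))
            + (avg_loglik(right, model_r) - avg_loglik(right, model_l)))

# Synthetic 1-D "features" for two hypothetical speakers.
rng = random.Random(0)
ubm = {"weights": [0.5, 0.5], "means": [-1.0, 1.0], "variances": [1.0, 1.0]}
spk1_a = [rng.gauss(-1.5, 0.5) for _ in range(200)]
spk1_b = [rng.gauss(-1.5, 0.5) for _ in range(200)]
spk2 = [rng.gauss(1.5, 0.5) for _ in range(200)]

score_same = change_score(spk1_a, spk1_b, ubm)  # both windows from speaker 1
score_diff = change_score(spk1_a, spk2, ubm)    # boundary between speakers
print("same speaker:", score_same)
print("speaker change:", score_diff)
```

In this sketch a change point would be hypothesized wherever the score between adjacent windows exceeds a threshold. Adapting from a shared background model, rather than training each window's GMM from scratch, is one common way to cope with the short segments the abstract mentions, since components that see little data remain near their prior values.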