Speaker segmentation using adapted GMMs
Mohamed Lazhar Bellagha, M. Labidi, M. Maraoui
2017 International Conference on Engineering & MIS (ICEMIS), published 2017-05-01
DOI: 10.1109/ICEMIS.2017.8273020 (https://doi.org/10.1109/ICEMIS.2017.8273020)
Citation count: 0
Abstract
Broadcast news generally involves a large number of speakers, who frequently appear under varying acoustic conditions. As a result, the audio stream contains many speaker turns. Because the resulting speech segments are short and provide little data, existing segmentation techniques tend to lack robustness. To address the problems caused by the high number of speaker turns, we propose an unsupervised speaker segmentation approach that retains the high accuracy of model-selection-based methods. The approach uses adapted Gaussian mixture models (GMMs) to describe each speaker's characteristics. Experimental results on several news programs show that the proposed method significantly improves segmentation and therefore reduces the diarization error.
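
To make the idea of describing short segments with adapted GMMs more concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the adaptation scheme, the features, or the decision criterion, so everything here is an assumption: mean-only MAP adaptation of a shared background GMM with a relevance factor of 16 (a common choice in GMM-UBM systems), scalar toy features instead of cepstral vectors, and a symmetric cross-likelihood score computed between two adjacent windows. All function and variable names are hypothetical.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def map_adapt_means(weights, means, variances, data, relevance=16.0):
    """MAP-adapt only the means of a 1-D GMM to `data` (assumed scheme)."""
    soft_count = [0.0] * len(means)
    soft_sum = [0.0] * len(means)
    for x in data:
        joint = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        total = sum(joint)
        if total == 0.0:
            continue  # frame too far from every component to be assigned
        for i, j in enumerate(joint):
            soft_count[i] += j / total          # posterior of component i
            soft_sum[i] += (j / total) * x
    adapted = []
    for n, s, prior_mean in zip(soft_count, soft_sum, means):
        if n == 0.0:
            adapted.append(prior_mean)          # no data: keep the prior mean
            continue
        alpha = n / (n + relevance)             # more data -> trust data more
        adapted.append(alpha * (s / n) + (1.0 - alpha) * prior_mean)
    return adapted

def avg_log_likelihood(weights, means, variances, data):
    """Average per-frame log-likelihood of `data` under the GMM."""
    total = 0.0
    for x in data:
        p = sum(w * gaussian_pdf(x, m, v)
                for w, m, v in zip(weights, means, variances))
        total += math.log(max(p, 1e-300))       # floor avoids log(0)
    return total / len(data)

def change_score(weights, ubm_means, variances, left, right, relevance=16.0):
    """Symmetric cross-likelihood score between two adjacent windows.

    Each window gets its own adapted model; the score grows when each
    window is explained better by its own model than by the other one.
    """
    left_means = map_adapt_means(weights, ubm_means, variances, left, relevance)
    right_means = map_adapt_means(weights, ubm_means, variances, right, relevance)

    def ll(means, data):
        return avg_log_likelihood(weights, means, variances, data)

    return (ll(left_means, left) - ll(right_means, left)
            + ll(right_means, right) - ll(left_means, right))

# Toy background model and toy "speakers" (illustrative values only).
UBM_WEIGHTS = [0.5, 0.5]
UBM_MEANS = [0.0, 4.0]
UBM_VARIANCES = [1.0, 1.0]
speaker_a = [1.0 + 0.1 * ((i % 5) - 2) for i in range(50)]
speaker_a_again = [1.0 + 0.1 * ((i % 3) - 1) for i in range(50)]
speaker_b = [3.0 + 0.1 * ((i % 5) - 2) for i in range(50)]

print("same speaker:     ",
      change_score(UBM_WEIGHTS, UBM_MEANS, UBM_VARIANCES, speaker_a, speaker_a_again))
print("different speaker:",
      change_score(UBM_WEIGHTS, UBM_MEANS, UBM_VARIANCES, speaker_a, speaker_b))
```

In a sketch like this, a candidate speaker turn would be declared wherever the score between neighbouring windows exceeds a threshold. Adapting from a shared background model rather than training each window's GMM from scratch is what keeps the estimate usable when a window holds only a few seconds of speech, which is the short-segment situation the abstract describes.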