Speaker Change Detection Based on Spectral Peak Track Analysis for Korean Broadcast News
Ji-Soo Keum, Hyon-Soo Lee
2005 5th International Conference on Information Communications & Signal Processing, 2005-12-06
DOI: https://doi.org/10.1109/ICICS.2005.1689143
In this paper, we propose a new speaker change detection algorithm based on spectral peak track analysis. Spoken Korean sentences and words have a rhythm and intonation pattern that varies with the kind of sentence and the speaking style. Spectral peak tracks are related to this information and to the audio data type, so we use spectral peak track information as a speaker feature for change detection. The proposed method consists of three parts: data segmentation, feature generation based on spectral peak track analysis, and change detection. We assume that change points occur at breathing points, because the locations of these points depend on sentence duration, speaking speed and style, and grammar. To evaluate the proposed method, we calculated the precision (PRC) and recall (RCL) for Korean broadcast news and compared these results with the Bayesian information criterion (BIC) method on randomly selected segments. In the experiments, the PRC is 73.14% and the RCL is 85.46% for Korean broadcast news, a performance comparable to BIC.
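The abstract reports PRC and RCL but does not say how a detected change point is matched to a reference one. The following is a minimal sketch of one common way to compute the two scores, assuming that a detection counts as correct when it falls within a tolerance window of a reference change point that has not yet been matched. The function names, the 1-second tolerance, and the timestamps are all hypothetical and are not taken from the paper.

```python
def match_change_points(detected, reference, tolerance=1.0):
    """Count detections that lie within `tolerance` seconds of a
    not-yet-matched reference change point (greedy, one-to-one)."""
    unmatched = sorted(reference)
    hits = 0
    for t in sorted(detected):
        # Closest reference point that has not been claimed yet.
        best = min(unmatched, key=lambda r: abs(r - t), default=None)
        if best is not None and abs(best - t) <= tolerance:
            hits += 1
            unmatched.remove(best)
    return hits


def precision_recall(detected, reference, tolerance=1.0):
    """Precision = correct detections / all detections.
    Recall = correct detections / all reference change points."""
    hits = match_change_points(detected, reference, tolerance)
    precision = hits / len(detected) if detected else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical timestamps in seconds, for illustration only.
reference = [12.0, 30.5, 47.2, 63.0]
detected = [11.6, 20.0, 30.9, 47.0, 55.0]

prc, rcl = precision_recall(detected, reference)
print(f"PRC = {prc:.2%}, RCL = {rcl:.2%}")
```

With these made-up timestamps, three of the five detections match a reference point, so precision is 3/5 and recall is 3/4. A recall higher than precision, as in the reported 85.46% versus 73.14%, indicates that missed change points are less frequent than false alarms.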