K. Sreenivasa Rao, Ketan Pachpande, R. R. Vempada, Sudhamay Maity
2012 National Conference on Communications (NCC). Published 2012-04-03. DOI: 10.1109/NCC.2012.6176848 (https://doi.org/10.1109/NCC.2012.6176848)
Segmentation of TV broadcast news using speaker specific information
In this paper, we propose a two-stage segmentation approach for splitting TV broadcast news bulletins into a sequence of news stories. In the first stage, speaker-specific (newsreader) characteristics present in the initial headlines of the bulletin are used for gross-level segmentation. In the second stage, errors in the gross-level segmentation are corrected by exploiting speaker-specific information captured from the individual news stories rather than from the headlines. The correction is needed because the speaker-specific information captured during the headlines is mixed with background music, so the first-stage segmentation may not be accurate. Speaker-specific information is represented by mel-frequency cepstral coefficients (MFCCs) and modeled with Gaussian mixture models (GMMs). The proposed two-stage method is evaluated on ten manually segmented broadcast TV news bulletins. The results show that about 93% of the news stories are correctly segmented, 7% are missed, and 11% are spurious.
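The abstract reports three figures (correct, missed, spurious) without stating how a detected boundary is matched to a manual one. The sketch below shows one plausible way such figures could be computed. It is an illustration, not the authors' evaluation procedure: the function name `score_boundaries`, the 2-second tolerance, the greedy nearest-match rule, the normalization of all three figures by the number of reference boundaries, and the boundary times are all assumptions. Under that normalization, correct and missed sum to 100% while spurious is counted separately, which matches the pattern of the reported 93%, 7%, and 11%.

```python
def score_boundaries(reference, hypothesis, tolerance=2.0):
    """Compare hypothesized story boundaries with manual ones (times in seconds).

    Assumption: a hypothesis counts as correct if it lies within `tolerance`
    seconds of a reference boundary that has not already been matched.
    """
    if not reference:
        raise ValueError("reference must contain at least one boundary")

    unmatched = sorted(hypothesis)
    correct = 0
    for ref in sorted(reference):
        # Candidate hypotheses close enough to this reference boundary.
        candidates = [h for h in unmatched if abs(h - ref) <= tolerance]
        if candidates:
            # Greedy choice: take the nearest candidate and consume it.
            best = min(candidates, key=lambda h: abs(h - ref))
            unmatched.remove(best)
            correct += 1

    n = len(reference)
    missed = n - correct          # reference boundaries with no match
    spurious = len(unmatched)     # hypotheses that matched nothing
    return {
        "correct_pct": 100.0 * correct / n,
        "missed_pct": 100.0 * missed / n,
        "spurious_pct": 100.0 * spurious / n,
    }


# Made-up boundary times for one bulletin, for illustration only.
reference = [0.0, 62.5, 130.0, 201.0, 275.5]
hypothesis = [0.4, 61.0, 98.0, 202.5, 276.0]
result = score_boundaries(reference, hypothesis)
print(result)
```

In this toy bulletin, four of the five reference boundaries are matched, the boundary at 130.0 s is missed, and the hypothesis at 98.0 s is spurious. A different tolerance or matching rule would change the figures, so any comparison with the reported results depends on the paper's actual criteria.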