Benjamin Bigot, I. Ferrané, J. Pinquier, R. André-Obrecht
SSCS '10, published 2010-10-29. DOI: 10.1145/1878101.1878104 (https://doi.org/10.1145/1878101.1878104)
Speaker role recognition to help spontaneous conversational speech detection
In the context of audio indexing, we present our recent contributions to speaker role recognition, particularly as applied to conversational speech.
We assume that temporal, acoustic and prosodic features, extracted from the results of speaker segmentation and from the audio files, carry clues about roles such as Anchor, Journalists or Others. The investigations in this paper are carried out on the EPAC corpus, which mainly contains conversational documents. First, an automatic clustering approach is used to validate the proposed features and the role definitions. In a second study, we propose a hierarchical supervised classification system, and we investigate dimensionality reduction methods as well as feature selection. This system correctly classifies 92% of speaker roles.
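As a rough illustration of what temporal features derived from a speaker segmentation might look like, the sketch below computes a few per-speaker statistics in Python. The function name `temporal_features`, the `(speaker_id, start, end)` segment format, and the four statistics are assumptions made for this example only; the abstract does not specify the actual feature set, and no acoustic or prosodic features are covered here.

```python
from collections import defaultdict


def temporal_features(segments):
    """Compute simple per-speaker temporal statistics.

    segments: list of (speaker_id, start_sec, end_sec) tuples, a hypothetical
    stand-in for the output of a speaker segmentation system.
    """
    durations = defaultdict(list)
    for speaker, start, end in segments:
        durations[speaker].append(end - start)

    # Total speech time over all speakers, used to normalise each share.
    total = sum(sum(d) for d in durations.values())

    features = {}
    for speaker, d in durations.items():
        speech_time = sum(d)
        features[speaker] = {
            "n_segments": len(d),                  # number of speech turns
            "speech_time": speech_time,            # total seconds of speech
            "mean_segment": speech_time / len(d),  # average turn length
            "share": speech_time / total if total else 0.0,  # fraction of all speech
        }
    return features


# Toy segmentation, invented for illustration only.
segments = [
    ("spk1", 0.0, 10.0),
    ("spk2", 10.0, 40.0),
    ("spk1", 40.0, 50.0),
    ("spk3", 50.0, 60.0),
]
feats = temporal_features(segments)
```

Feature vectors of this kind could then be passed to a clustering step or a supervised classifier; which features are actually discriminative for each role is what the studies described above evaluate.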