Daniel Schneider, Sebastian Tschöpel, J. Schwenninger
2012 13th International Workshop on Image Analysis for Multimedia Interactive Services. Published 2012-05-23. DOI: 10.1109/WIAMIS.2012.6226755 (https://doi.org/10.1109/WIAMIS.2012.6226755)
Social recommendation using speech recognition: Sharing TV scenes in social networks
We describe a novel system that simplifies the recommendation of video scenes in social networks, thereby attracting a new audience for existing video portals. Users select an interesting quote from a speech recognition transcript and share the corresponding video scene with their social circle with minimal effort. The system was designed in close cooperation with the largest German public broadcaster (ARD) and deployed on the broadcaster's public video portal. A twofold adaptation strategy tailors our speech recognition system to this use case. First, we create a database of speaker-adapted acoustic models for the most important speakers in the corpus. Spectral speaker identification detects whether one of these speakers is speaking, and the corresponding model is selected accordingly. Second, we apply language model adaptation by exploiting prior knowledge about the video category.
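The two mechanisms described above, mapping a selected quote to a video scene and choosing a speaker-adapted acoustic model, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The `Word` record, the exact-match lookup, the `padding` parameter, the score threshold, and the model names (`adapted/...`, `speaker_independent`) are all assumptions made for illustration; the abstract does not specify how quotes are aligned or how the speaker identification decision is made.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Word:
    """One recognized word with its time span (hypothetical transcript format)."""
    text: str
    start: float  # seconds from the start of the video
    end: float


def find_scene(transcript: List[Word], quote: str,
               padding: float = 1.0) -> Optional[Tuple[float, float]]:
    """Return (start, end) in seconds for the first exact occurrence of `quote`.

    Matching is case-insensitive on whitespace-separated tokens. `padding`
    widens the clip on both sides; it is an illustrative assumption.
    """
    target = quote.lower().split()
    if not target:
        return None
    tokens = [w.text.lower() for w in transcript]
    for i in range(len(tokens) - len(target) + 1):
        if tokens[i:i + len(target)] == target:
            first = transcript[i]
            last = transcript[i + len(target) - 1]
            return max(0.0, first.start - padding), last.end + padding
    return None


def select_acoustic_model(scores: Dict[str, float], threshold: float,
                          fallback: str = "speaker_independent") -> str:
    """Pick the adapted model of the best-scoring known speaker.

    `scores` maps speaker names to speaker identification scores (higher is
    better). If no known speaker reaches `threshold`, the generic model is used.
    """
    if not scores:
        return fallback
    speaker, score = max(scores.items(), key=lambda kv: kv[1])
    return f"adapted/{speaker}" if score >= threshold else fallback


if __name__ == "__main__":
    transcript = [
        Word("Guten", 0.0, 0.4),
        Word("Abend", 0.4, 0.9),
        Word("meine", 1.0, 1.3),
        Word("Damen", 1.3, 1.7),
    ]
    print(find_scene(transcript, "abend meine", padding=0.5))
    print(select_acoustic_model({"anchor_a": 0.9, "anchor_b": 0.2}, threshold=0.5))
```

In a deployed system the quote lookup would probably need to tolerate recognition errors and repeated phrases, and the returned time span would be turned into a shareable link to the video portal. Language model adaptation by video category could follow the same selection pattern, with the category metadata rather than a speaker score choosing the model.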