不仅仅是单词:播客的非文本特征建模

Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining Pub Date : 2019-01-30 DOI:10.1145/3289600.3290993

Longqi Yang, Yu Wang, D. Dunne, Michael Sobolev, Mor Naaman, D. Estrin

{"title":"不仅仅是单词:播客的非文本特征建模","authors":"Longqi Yang, Yu Wang, D. Dunne, Michael Sobolev, Mor Naaman, D. Estrin","doi":"10.1145/3289600.3290993","DOIUrl":null,"url":null,"abstract":"Recent years have witnessed the flourishing of podcasts, a unique type of audio medium. Prior work on podcast content modeling focused on analyzing Automatic Speech Recognition outputs, which ignored vocal, musical, and conversational properties (e.g., energy, humor, and creativity) that uniquely characterize this medium. In this paper, we present an Adversarial Learning-based Podcast Representation (ALPR) that captures non-textual aspects of podcasts. Through extensive experiments on a large-scale podcast dataset (88,728 episodes from 18,433 channels), we show that (1) ALPR significantly outperforms the state-of-the-art features developed for music and speech in predicting theseriousness andenergy of podcasts, and (2) incorporating ALPR significantly improves the performance of topic-based podcast-popularity prediction. Our experiments also reveal factors that correlate with podcast popularity.","PeriodicalId":143253,"journal":{"name":"Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining","volume":"13 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":"{\"title\":\"More Than Just Words: Modeling Non-Textual Characteristics of Podcasts\",\"authors\":\"Longqi Yang, Yu Wang, D. Dunne, Michael Sobolev, Mor Naaman, D. Estrin\",\"doi\":\"10.1145/3289600.3290993\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Recent years have witnessed the flourishing of podcasts, a unique type of audio medium. Prior work on podcast content modeling focused on analyzing Automatic Speech Recognition outputs, which ignored vocal, musical, and conversational properties (e.g., energy, humor, and creativity) that uniquely characterize this medium. In this paper, we present an Adversarial Learning-based Podcast Representation (ALPR) that captures non-textual aspects of podcasts. Through extensive experiments on a large-scale podcast dataset (88,728 episodes from 18,433 channels), we show that (1) ALPR significantly outperforms the state-of-the-art features developed for music and speech in predicting theseriousness andenergy of podcasts, and (2) incorporating ALPR significantly improves the performance of topic-based podcast-popularity prediction. Our experiments also reveal factors that correlate with podcast popularity.\",\"PeriodicalId\":143253,\"journal\":{\"name\":\"Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining\",\"volume\":\"13 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-01-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"15\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3289600.3290993\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3289600.3290993","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 15

摘要

播客是一种独特的音频媒体，近年来蓬勃发展。之前关于播客内容建模的工作主要集中在分析自动语音识别输出，而忽略了这种媒体特有的声乐、音乐和会话属性(例如，能量、幽默和创造力)。在本文中，我们提出了一种基于对抗性学习的播客表示(ALPR)，它可以捕获播客的非文本方面。通过大规模播客数据集(来自18433个频道的88,728集)的广泛实验，我们表明:(1)ALPR在预测播客的严重性和能量方面显著优于为音乐和语音开发的最先进特征，(2)结合ALPR显著提高了基于主题的播客流行度预测的性能。我们的实验还揭示了与播客受欢迎程度相关的因素。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

More Than Just Words: Modeling Non-Textual Characteristics of Podcasts

Recent years have witnessed the flourishing of podcasts, a unique type of audio medium. Prior work on podcast content modeling focused on analyzing Automatic Speech Recognition outputs, which ignored vocal, musical, and conversational properties (e.g., energy, humor, and creativity) that uniquely characterize this medium. In this paper, we present an Adversarial Learning-based Podcast Representation (ALPR) that captures non-textual aspects of podcasts. Through extensive experiments on a large-scale podcast dataset (88,728 episodes from 18,433 channels), we show that (1) ALPR significantly outperforms the state-of-the-art features developed for music and speech in predicting theseriousness andenergy of podcasts, and (2) incorporating ALPR significantly improves the performance of topic-based podcast-popularity prediction. Our experiments also reveal factors that correlate with podcast popularity.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining

自引率

0.00%

发文量