{"title":"微观告诉宏观:通过转换模型预测微视频的流行","authors":"Jingyuan Chen, Xuemeng Song, Liqiang Nie, Xiang Wang, Hanwang Zhang, Tat-Seng Chua","doi":"10.1145/2964284.2964314","DOIUrl":null,"url":null,"abstract":"Micro-videos, a new form of user generated contents (UGCs), are gaining increasing enthusiasm. Popular micro-videos have enormous commercial potential in many ways, such as online marketing and brand tracking. In fact, the popularity prediction of traditional UGCs including tweets, web images, and long videos, has achieved good theoretical underpinnings and great practical success. However, little research has thus far been conducted to predict the popularity of the bite-sized videos. This task is non-trivial due to three reasons: 1) micro-videos are short in duration and of low quality; 2) they can be described by multiple heterogeneous channels, spanning from social, visual, acoustic to textual modalities; and 3) there are no available benchmark dataset and discriminant features that are suitable for this task. Towards this end, we present a transductive multi-modal learning model. The proposed model is designed to find the optimal latent common space, unifying and preserving information from different modalities, whereby micro-videos can be better represented. This latent space can be used to alleviate the information insufficiency problem caused by the brief nature of micro-videos. In addition, we built a benchmark dataset and extracted a rich set of popularity-oriented features to characterize the popular micro-videos. Extensive experiments have demonstrated the effectiveness of the proposed model. As a side contribution, we have released the dataset, codes and parameters to facilitate other researchers.","PeriodicalId":140670,"journal":{"name":"Proceedings of the 24th ACM international conference on Multimedia","volume":"54 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"131","resultStr":"{\"title\":\"Micro Tells Macro: Predicting the Popularity of Micro-Videos via a Transductive Model\",\"authors\":\"Jingyuan Chen, Xuemeng Song, Liqiang Nie, Xiang Wang, Hanwang Zhang, Tat-Seng Chua\",\"doi\":\"10.1145/2964284.2964314\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Micro-videos, a new form of user generated contents (UGCs), are gaining increasing enthusiasm. Popular micro-videos have enormous commercial potential in many ways, such as online marketing and brand tracking. In fact, the popularity prediction of traditional UGCs including tweets, web images, and long videos, has achieved good theoretical underpinnings and great practical success. However, little research has thus far been conducted to predict the popularity of the bite-sized videos. This task is non-trivial due to three reasons: 1) micro-videos are short in duration and of low quality; 2) they can be described by multiple heterogeneous channels, spanning from social, visual, acoustic to textual modalities; and 3) there are no available benchmark dataset and discriminant features that are suitable for this task. Towards this end, we present a transductive multi-modal learning model. The proposed model is designed to find the optimal latent common space, unifying and preserving information from different modalities, whereby micro-videos can be better represented. This latent space can be used to alleviate the information insufficiency problem caused by the brief nature of micro-videos. In addition, we built a benchmark dataset and extracted a rich set of popularity-oriented features to characterize the popular micro-videos. Extensive experiments have demonstrated the effectiveness of the proposed model. As a side contribution, we have released the dataset, codes and parameters to facilitate other researchers.\",\"PeriodicalId\":140670,\"journal\":{\"name\":\"Proceedings of the 24th ACM international conference on Multimedia\",\"volume\":\"54 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"131\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 24th ACM international conference on Multimedia\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2964284.2964314\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 24th ACM international conference on Multimedia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2964284.2964314","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Micro Tells Macro: Predicting the Popularity of Micro-Videos via a Transductive Model
Micro-videos, a new form of user generated contents (UGCs), are gaining increasing enthusiasm. Popular micro-videos have enormous commercial potential in many ways, such as online marketing and brand tracking. In fact, the popularity prediction of traditional UGCs including tweets, web images, and long videos, has achieved good theoretical underpinnings and great practical success. However, little research has thus far been conducted to predict the popularity of the bite-sized videos. This task is non-trivial due to three reasons: 1) micro-videos are short in duration and of low quality; 2) they can be described by multiple heterogeneous channels, spanning from social, visual, acoustic to textual modalities; and 3) there are no available benchmark dataset and discriminant features that are suitable for this task. Towards this end, we present a transductive multi-modal learning model. The proposed model is designed to find the optimal latent common space, unifying and preserving information from different modalities, whereby micro-videos can be better represented. This latent space can be used to alleviate the information insufficiency problem caused by the brief nature of micro-videos. In addition, we built a benchmark dataset and extracted a rich set of popularity-oriented features to characterize the popular micro-videos. Extensive experiments have demonstrated the effectiveness of the proposed model. As a side contribution, we have released the dataset, codes and parameters to facilitate other researchers.