Multi-Modal Learning: Study on A Large-Scale Micro-Video Data Collection

Proceedings of the 24th ACM International Conference on Multimedia. Pub Date: 2016-10-01. DOI: 10.1145/2964284.2971477

Jingyuan Chen

Citations: 15

Abstract

Micro-video sharing services, a new phenomenon in social media, enable users to share micro-videos and are attracting growing enthusiasm. One distinct characteristic of micro-videos is their multi-modality: these videos always have visual signals, audio tracks, textual descriptions, and social cues. Such multi-modal data makes it possible to obtain a comprehensive understanding of the videos and hence provides new opportunities for researchers. However, limited effort has thus far been dedicated to this newly emerging kind of user-generated content (UGC), owing to the lack of a large-scale benchmark dataset. To this end, in this paper we construct a large-scale micro-video dataset that can support many research domains, such as popularity prediction and venue estimation. Based on this dataset, we conduct an initial study of popularity prediction for micro-videos. Finally, we outline our future work.



