Kickstarting the Commons: The YFCC100M and the YLI Corpora

MMCommons '15 | Pub Date: 2015-10-30 | DOI: 10.1145/2814815.2816986

Julia Bernd, Damian Borth, C. Carrano, Jaeyoung Choi, Benjamin Elizalde, G. Friedland, L. Gottlieb, Karl S. Ni, R. Pearce, Douglas N. Poland, Khalid Ashraf, David A. Shamma, B. Thomee

Citations: 15

Abstract

The publication of the Yahoo Flickr Creative Commons 100 Million dataset (YFCC100M), to date the largest open-access collection of photos and videos, has provided a unique opportunity to stimulate new research in multimedia analysis and retrieval. To make the YFCC100M even more valuable, we have started working towards supplementing it with a comprehensive set of precomputed features and high-quality ground truth annotations. As part of our efforts, we are releasing the YLI feature corpus, as well as the YLI-GEO and YLI-MED annotation subsets. Under the Multimedia Commons Project (MMCP), we are currently laying the groundwork for a common platform and framework around the YFCC100M that (i) makes it easier for researchers to contribute additional features and annotations, (ii) supports experimentation on the dataset, and (iii) enables sharing of obtained results. This paper describes the YLI features and annotations released thus far, and sketches our vision for the MMCP.

