Kickstarting the Commons: The YFCC100M and the YLI Corpora

Julia Bernd, Damian Borth, C. Carrano, Jaeyoung Choi, Benjamin Elizalde, G. Friedland, L. Gottlieb, Karl S. Ni, R. Pearce, Douglas N. Poland, Khalid Ashraf, David A. Shamma, B. Thomee
Published in: MMCommons '15, 2015-10-30. DOI: 10.1145/2814815.2816986 (https://doi.org/10.1145/2814815.2816986)
Citations: 15

Abstract

The publication of the Yahoo Flickr Creative Commons 100 Million dataset (YFCC100M), to date the largest open-access collection of photos and videos, has provided a unique opportunity to stimulate new research in multimedia analysis and retrieval. To make the YFCC100M even more valuable, we have started supplementing it with a comprehensive set of precomputed features and high-quality ground-truth annotations. As part of these efforts, we are releasing the YLI feature corpus, as well as the YLI-GEO and YLI-MED annotation subsets. Under the Multimedia Commons Project (MMCP), we are currently laying the groundwork for a common platform and framework around the YFCC100M that (i) makes it easier for researchers to contribute additional features and annotations, (ii) supports experimentation on the dataset, and (iii) enables sharing of the results obtained. This paper describes the YLI features and annotations released thus far and sketches our vision for the MMCP.