Vis2Rec: A Large-Scale Visual Dataset for Visit Recommendation

Michael Soumm, Adrian Daniel Popescu, Bertrand Delezoide
Published in: 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2023-01-01
DOI: 10.1109/WACV56688.2023.00300 (https://doi.org/10.1109/WACV56688.2023.00300)
Citations: 1

Abstract

Most recommendation datasets for tourism are restricted to one world region and rely on explicit data such as check-ins. In reality, however, tourists visit places worldwide and document their trips primarily through photos. These images contain a wealth of raw information that can be used to capture users’ preferences and recommend personalized content. Visual content has been used in past work, but no large-scale, publicly available dataset that gives access to users’ personal images exists for recommender systems. As such a resource would open up possibilities for new image-based recommendation algorithms, we introduce Vis2Rec, a new dataset based on visit data extracted from users’ Flickr photographic streams, which includes over 7 million photos, 36k recognizable points of interest, and 14k user profiles. Google Landmarks v2 is used as an auxiliary dataset to identify points of interest in users’ photos, using a state-of-the-art image-matching deep architecture. Image-based user profiles are then constituted by aggregating the points of interest detected for each user. In addition, ground-truth visits were determined for the test subset in order to enable accurate evaluation. Finally, we benchmark Vis2Rec using various existing recommender systems, and discuss the possibilities opened up by the availability of user images, as well as the societal issues that come with them. Following good practice in dataset sharing, Vis2Rec is created using only freely distributable content, and additional anonymization is performed to ensure the privacy of users. The raw dataset and the preprocessed user profiles will be publicly available at https://github.com/MSoumm/Vis2Rec.
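
The profile-building step mentioned in the abstract (aggregating the points of interest detected in each user's photos) might look roughly like the sketch below. This is only an illustration under stated assumptions, not the authors' pipeline: the `(user_id, photo_id, poi_id)` tuple layout, the function name `build_profiles`, and the sample identifiers are all hypothetical, and the image-matching model that would produce the detections is not shown.

```python
from collections import Counter, defaultdict


def build_profiles(detections):
    """Aggregate per-photo POI detections into per-user visit profiles.

    `detections` is an iterable of (user_id, photo_id, poi_id) tuples
    (a hypothetical layout); poi_id is None when no point of interest
    was recognized in the photo.
    Returns {user_id: Counter({poi_id: number_of_matching_photos})}.
    """
    profiles = defaultdict(Counter)
    for user_id, _photo_id, poi_id in detections:
        if poi_id is not None:  # skip photos with no recognized POI
            profiles[user_id][poi_id] += 1
    return dict(profiles)


# Made-up sample detections, for illustration only.
detections = [
    ("u1", "p1", "eiffel_tower"),
    ("u1", "p2", "eiffel_tower"),
    ("u1", "p3", None),
    ("u1", "p4", "louvre"),
    ("u2", "p5", "colosseum"),
]
profiles = build_profiles(detections)
```

Counting photos per point of interest, rather than keeping a plain set, is a design choice made here so that a recommender could weight repeated evidence of a visit; the abstract does not say which representation the dataset uses.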