On the feasibility of crawling-based attacks against recommender systems

J. Comput. Secur. Pub Date : 2021-11-04 DOI:10.3233/jcs-210041

F. Aiolli, M. Conti, S. Picek, Mirko Polato



Abstract

Nowadays, online services, like e-commerce or streaming services, provide a personalized user experience through recommender systems. Recommender systems are built upon a vast amount of data about users/items acquired by the services. Such knowledge represents an invaluable resource. However, commonly, part of this knowledge is public and can be easily accessed via the Internet. Unfortunately, that same knowledge can be leveraged by competitors or malicious users. The literature offers a large number of works concerning attacks on recommender systems, but most of them assume that the attacker can easily access the full rating matrix. In practice, this is never the case. The only way to access the rating matrix is by gathering the ratings (e.g., reviews) by crawling the service’s website. Crawling a website has a cost in terms of time and resources. What is more, the targeted website can employ defensive measures to detect automatic scraping. In this paper, we assess the impact of a series of attacks on recommender systems. Our analysis aims to set up the most realistic scenarios considering both the possibilities and the potential attacker’s limitations. In particular, we assess the impact of different crawling approaches when attacking a recommendation service. From the collected information, we mount various profile injection attacks. We measure the value of the collected knowledge through the identification of the most similar user/item. Our empirical results show that while crawling can indeed bring knowledge to the attacker (up to 65% of neighborhood reconstruction on a mid-size dataset and up to 90% on a small-size dataset), this will not be enough to mount a successful shilling attack in practice.
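The abstract measures the value of crawled knowledge by how well the most similar users/items can be identified from it. The following is a minimal sketch of one way such a "neighborhood reconstruction" score could be computed. It is not the authors' implementation: the cosine similarity, the top-k overlap score, the uniform random `crawl` model, and all function names are assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def cosine_top_k(ratings: np.ndarray, user: int, k: int) -> set:
    """Return the k users most cosine-similar to `user`, excluding the user itself."""
    norms = np.linalg.norm(ratings, axis=1)
    norms[norms == 0] = 1.0  # empty profiles: avoid division by zero
    sims = (ratings @ ratings[user]) / (norms * norms[user])
    sims[user] = -np.inf  # never pick the user as its own neighbor
    top = np.argsort(-sims, kind="stable")[:k]
    return set(int(i) for i in top)


def neighborhood_reconstruction(full: np.ndarray, crawled: np.ndarray, k: int) -> float:
    """Average fraction of each user's true top-k neighbors recovered from crawled data."""
    overlaps = [
        len(cosine_top_k(full, u, k) & cosine_top_k(crawled, u, k)) / k
        for u in range(full.shape[0])
    ]
    return float(np.mean(overlaps))


def crawl(full: np.ndarray, fraction: float, rng: np.random.Generator) -> np.ndarray:
    """Hypothetical crawler: each existing rating is observed with probability `fraction`.

    Assumes 0 encodes a missing rating in `full`.
    """
    observed = (rng.random(full.shape) < fraction) & (full > 0)
    return np.where(observed, full, 0.0)


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    # Synthetic users-by-items rating matrix with values 0..5 (0 = unrated).
    full = rng.integers(0, 6, size=(50, 30)).astype(float)
    for fraction in (0.1, 0.5, 1.0):
        partial = crawl(full, fraction, rng)
        score = neighborhood_reconstruction(full, partial, k=5)
        print(f"crawled fraction={fraction:.1f} reconstruction={score:.2f}")
```

Under these assumptions, a score of 1.0 means the attacker's partial view yields exactly the same neighborhoods as the full rating matrix. A real crawler would not sample ratings uniformly, so the random mask here only stands in for whichever crawling strategy is being evaluated.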


