On the effectiveness of anonymizing networks for web search privacy

Sai Teja Peddinti, Nitesh Saxena
{"title":"On the effectiveness of anonymizing networks for web search privacy","authors":"Sai Teja Peddinti, Nitesh Saxena","doi":"10.1145/1966913.1966984","DOIUrl":null,"url":null,"abstract":"Web search has emerged as one of the most important applications on the internet, with several search engines available to the users. There is a common practice among these search engines to log and analyse the user queries, which leads to serious privacy implications. One well known solution to search privacy involves issuing the queries via an anonymizing network, such as Tor, thereby hiding one's identity from the search engine. A fundamental problem with this solution, however, is that user queries are still obviously revealed to the search engine, although they are \"mixed\" among the queries issued by other users of the same anonymization service.\n In this paper, we consider the problem of identifying the queries of a user of interest (UOI) within a pool of queries received by a search engine over an anonymizing network. We demonstrate that an adversarial search engine can extract the UOI's queries, when it is equipped with only a short-term user search query history, by utilizing only the query content information and off-the-shelf machine learning classifiers. More specifically, by treating a selected set of 60 users --- from the publicly-available AOL search logs --- as the users of interest performing web search over an anonymizing network, we show that each user's queries can be identified with 25.95% average accuracy, when mixed with queries of 99 other users of the anonymization service. This average accuracy drops to 18.95% when queries of 999 other users of the anonymization service are mixed together. Though the average accuracies are not so high, our results indicate that few users of interest could be identified with accuracies as high as 80--98%, even when their queries are mixed among queries of 999 other users. Our results cast serious doubts on the effectiveness of anonymizing web search queries by means of anonymizing networks.","PeriodicalId":72308,"journal":{"name":"Asia CCS '22 : proceedings of the 2022 ACM Asia Conference on Computer and Communications Security : May 30-June 3, 2022, Nagasaki, Japan. ACM Asia Conference on Computer and Communications Security (17th : 2022 : Nagasaki-shi, Japan ; ...","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2011-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"12","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Asia CCS '22 : proceedings of the 2022 ACM Asia Conference on Computer and Communications Security : May 30-June 3, 2022, Nagasaki, Japan. ACM Asia Conference on Computer and Communications Security (17th : 2022 : Nagasaki-shi, Japan ; ...","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1966913.1966984","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 12

Abstract

Web search has emerged as one of the most important applications on the internet, with several search engines available to the users. There is a common practice among these search engines to log and analyse the user queries, which leads to serious privacy implications. One well known solution to search privacy involves issuing the queries via an anonymizing network, such as Tor, thereby hiding one's identity from the search engine. A fundamental problem with this solution, however, is that user queries are still obviously revealed to the search engine, although they are "mixed" among the queries issued by other users of the same anonymization service. In this paper, we consider the problem of identifying the queries of a user of interest (UOI) within a pool of queries received by a search engine over an anonymizing network. We demonstrate that an adversarial search engine can extract the UOI's queries, when it is equipped with only a short-term user search query history, by utilizing only the query content information and off-the-shelf machine learning classifiers. More specifically, by treating a selected set of 60 users --- from the publicly-available AOL search logs --- as the users of interest performing web search over an anonymizing network, we show that each user's queries can be identified with 25.95% average accuracy, when mixed with queries of 99 other users of the anonymization service. This average accuracy drops to 18.95% when queries of 999 other users of the anonymization service are mixed together. Though the average accuracies are not so high, our results indicate that few users of interest could be identified with accuracies as high as 80--98%, even when their queries are mixed among queries of 999 other users. Our results cast serious doubts on the effectiveness of anonymizing web search queries by means of anonymizing networks.
网络匿名化对网络搜索隐私保护的有效性研究
网络搜索已经成为互联网上最重要的应用之一,有几个搜索引擎可供用户使用。在这些搜索引擎中,有一种常见的做法是记录和分析用户查询,这会导致严重的隐私问题。搜索隐私的一个众所周知的解决方案是通过匿名网络(如Tor)发出查询,从而对搜索引擎隐藏自己的身份。然而,这种解决方案的一个基本问题是,用户的查询仍然明显地显示给搜索引擎,尽管它们“混合”在同一匿名化服务的其他用户发出的查询中。在本文中,我们考虑了在匿名网络上搜索引擎接收的查询池中识别感兴趣用户(UOI)查询的问题。我们证明了对抗性搜索引擎可以通过仅利用查询内容信息和现成的机器学习分类器,在仅配备短期用户搜索查询历史的情况下提取UOI的查询。更具体地说,通过从公开的AOL搜索日志中选择60个用户作为在匿名网络上执行网络搜索的用户,我们表明,当与匿名服务的99个其他用户的查询混合在一起时,每个用户的查询可以以25.95%的平均准确率识别。当999个其他匿名化服务用户的查询混合在一起时,平均准确率下降到18.95%。虽然平均准确率不是很高,但我们的结果表明,即使他们的查询与999个其他用户的查询混合在一起,也很少有感兴趣的用户可以被识别出准确率高达80% -98%的用户。我们的研究结果对通过匿名化网络来匿名化网络搜索查询的有效性提出了严重的质疑。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信