Efficient Query Obfuscation with Keyqueries
Maik Fröbe, Eric Schmidt, Matthias Hagen
Proceedings. IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, 2021-12-14
DOI: 10.1145/3486622.3493950 (https://doi.org/10.1145/3486622.3493950)

Search engine users who do not want a sensitive query to appear in a search engine’s query log can use query obfuscation or scrambling techniques to keep their information need private. However, the state-of-the-art obfuscation technique has limited practical applicability because it compares hundreds of thousands of candidate queries on a local corpus to select the final obfuscated queries. We propose a new approach to query obfuscation that combines an efficient enumeration algorithm with so-called keyqueries. Because it generates only hundreds of candidate queries, our approach is orders of magnitude faster and makes near-real-time obfuscation of sensitive information needs feasible. Our experiments in TREC scenarios on the ClueWeb corpora show that our approach achieves retrieval effectiveness comparable to the previous exhaustive candidate generation, with a run time of seconds instead of hours. Overall, 75% of the private information needs can be obfuscated while still retrieving at least one relevant document for the original private query, which itself never appears in the search engine logs. To further improve a user’s privacy, query obfuscation can easily be combined with other client-side tools such as TrackMeNot or PEAS fake queries, and with Tor routing.
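
The general idea of selecting obfuscated queries on a local corpus can be illustrated with a toy sketch. This is not the authors' enumeration algorithm. It assumes a common reading of "keyquery" (a query that returns given target documents in its top-k results and has no accepted sub-query), uses a hypothetical four-document corpus, a naive term-overlap ranker, and requires all target documents in the top-k. The names `CORPUS`, `STOPWORDS`, `search`, and `obfuscated_candidates` are invented for illustration.

```python
from itertools import combinations

# Hypothetical local corpus: document id -> text (illustrative data only).
CORPUS = {
    "d1": "symptoms and treatment of chronic migraine headache",
    "d2": "migraine triggers include stress and lack of sleep",
    "d3": "recipes for quick weeknight pasta dinners",
    "d4": "sleep hygiene tips for reducing stress",
}
STOPWORDS = {"and", "of", "for"}


def search(query_terms, k):
    """Toy retrieval: rank documents by the number of matching query terms."""
    scored = []
    for doc_id, text in CORPUS.items():
        tokens = set(text.split())
        score = sum(1 for term in query_terms if term in tokens)
        if score > 0:
            # Negative score sorts best-first; ties are broken by document id.
            scored.append((-score, doc_id))
    return [doc_id for _, doc_id in sorted(scored)[:k]]


def obfuscated_candidates(private_query, k=2, max_len=2):
    """Enumerate queries that avoid the private terms but still retrieve
    the private query's local top-k documents (assumed keyquery criterion)."""
    private_terms = set(private_query.split())
    # The private query is only ever run against the local corpus.
    targets = search(sorted(private_terms), k)

    # Candidate vocabulary: terms of the target documents, minus the
    # private terms and stopwords.
    vocab = sorted(
        {t for doc_id in targets for t in CORPUS[doc_id].split()}
        - private_terms
        - STOPWORDS
    )

    accepted = []
    for size in range(1, max_len + 1):
        for candidate in combinations(vocab, size):
            # Minimality: skip supersets of an already accepted query.
            if any(set(q) <= set(candidate) for q in accepted):
                continue
            results = search(candidate, k)
            if all(t in results for t in targets):
                accepted.append(candidate)
    return accepted


if __name__ == "__main__":
    for query in obfuscated_candidates("migraine"):
        print(" ".join(query))
```

In this sketch only the accepted candidates would be submitted to the real search engine, so the private term never leaves the client. Because shorter queries are enumerated first and supersets of accepted queries are skipped, the candidate set stays small, which is the kind of pruning that makes the difference between hundreds and hundreds of thousands of candidates plausible.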