Improved website fingerprinting on Tor
Authors: Tao Wang, I. Goldberg
Venue: Proceedings of the 12th ACM Workshop on Privacy in the Electronic Society
Publication date: 2013-11-04
DOI: 10.1145/2517840.2517851 (https://doi.org/10.1145/2517840.2517851)
Citations: 273
Abstract
In this paper, we propose new website fingerprinting techniques that achieve a higher classification accuracy on Tor than previous work. We describe our novel methodology for gathering data on Tor; this methodology is essential for accurate classifier comparison and analysis. We offer new ways to interpret the data by using the more fundamental Tor cells as a unit of data rather than TCP/IP packets. We demonstrate an experimental method to remove Tor SENDMEs, which are control cells that provide no useful data, in order to improve accuracy. We also propose a new set of metrics to describe the similarity between two traffic instances; they are derived from observations on how a site is loaded. Using our new metrics, we achieve a higher success rate than previous authors. We conduct a thorough analysis and comparison between our new algorithms and the previous best algorithm. To identify the potential power of website fingerprinting on Tor, we perform open-world experiments; we achieve a recall rate over 95% and a false positive rate under 0.2% for several potentially monitored sites, which far exceeds previously reported recall rates. In the closed-world experiments, our accuracy is 91%, as compared to 86-87% from the best previous classifier on the same data.
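To make the open-world figures concrete, the following is a minimal sketch of how a recall rate and a false positive rate are conventionally computed from binary "monitored / not monitored" decisions. It is not the authors' evaluation code: the function name `open_world_rates` and the toy labels are hypothetical, and the definitions used are the standard ones (recall = TP / (TP + FN), false positive rate = FP / (FP + TN)).

```python
from typing import Sequence, Tuple


def open_world_rates(
    is_monitored: Sequence[bool], flagged: Sequence[bool]
) -> Tuple[float, float]:
    """Return (recall, false_positive_rate) for binary open-world decisions.

    is_monitored[i] is True if trace i really came from a monitored site.
    flagged[i] is True if the classifier labeled trace i as monitored.
    Both names are hypothetical, chosen only for this illustration.
    """
    if len(is_monitored) != len(flagged):
        raise ValueError("label sequences must have the same length")

    pairs = list(zip(is_monitored, flagged))
    tp = sum(1 for m, f in pairs if m and f)          # monitored, flagged
    fn = sum(1 for m, f in pairs if m and not f)      # monitored, missed
    fp = sum(1 for m, f in pairs if not m and f)      # unmonitored, flagged
    tn = sum(1 for m, f in pairs if not m and not f)  # unmonitored, ignored

    # Guard against empty classes so the sketch never divides by zero.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return recall, fpr


# Toy data (made up): 4 monitored traces, 6 unmonitored traces.
truth = [True, True, True, True, False, False, False, False, False, False]
guess = [True, True, True, False, True, False, False, False, False, False]
recall, fpr = open_world_rates(truth, guess)
print(f"recall={recall:.2f}, false positive rate={fpr:.3f}")
```

In an open-world setting the unmonitored class is usually much larger than the monitored one, so even a small false positive rate can correspond to many false alarms in absolute terms; this is why the two rates are reported together.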