Attacking HTTPS Secure Search Service through Correlation Analysis of HTTP Webpages Accessed
Qian Liping, Wang Lidong
International Journal of Security and Its Applications, published 2017-07-31
DOI: 10.14257/IJSIA.2017.11.7.03 (https://doi.org/10.14257/IJSIA.2017.11.7.03)
Internet users commonly query a search engine when retrieving web information. Sensitive data about a user's intentions or behavior can be inferred from their query phrases and the webpages they visit subsequently. To protect communications from eavesdropping, a search engine can adopt HTTPS-by-default, which provides bidirectional encryption to protect its users' privacy. However, the majority of webpages indexed in a search engine's results pages are still hosted on HTTP-enabled websites, so attackers can observe the contents of these webpages once the user clicks on the indexed links. We propose a novel approach for attacking secure search through correlation analysis of encrypted searches with the unencrypted webpages the user visits subsequently. We show that a simple weighted TF-DF mechanism is sufficient for selecting candidate guessing phrases. By imitating search engine users, querying these candidates, and enumerating the webpages indexed in the results pages, we can hit the exact query phrases and at the same time reconstruct the user's web-surfing trails through DNS-based URL comparison and network traffic analysis based on flow feature statistics. In an experiment including 180 Chinese and English search phrases, we achieved a 67.78% hit rate at the first guess and a 96.11% hit rate within three guesses. Our empirical research shows that HTTPS traffic can be correlated and de-anonymized through HTTP traffic, and that a search engine's secure search is not always secure unless HTTPS-by-default is enabled everywhere.
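The abstract does not spell out how the weighted TF-DF score is computed, so the following is only a minimal sketch of one plausible reading: each term found in the observed HTTP pages is scored by a weighted sum of its relative term frequency (TF) and the fraction of pages that contain it (DF), and the top-scoring terms become guessing candidates. The function names, the weights `tf_weight` and `df_weight`, and the simple ASCII tokenizer are all assumptions for illustration; they are not taken from the paper, and the tokenizer would not handle the Chinese phrases used in the experiment.

```python
import re
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase ASCII words and digits only.
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_candidates(pages, tf_weight=1.0, df_weight=1.0, top_k=3):
    """Rank terms from observed page texts by an assumed weighted TF-DF score.

    TF is the term's share of all tokens across the pages; DF is the
    share of pages in which the term appears at least once.
    """
    tf = Counter()
    df = Counter()
    for page in pages:
        tokens = tokenize(page)
        tf.update(tokens)        # count every occurrence
        df.update(set(tokens))   # count each page at most once per term

    total_tokens = sum(tf.values()) or 1
    page_count = len(pages) or 1
    scores = {
        term: tf_weight * tf[term] / total_tokens
        + df_weight * df[term] / page_count
        for term in tf
    }
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda term: (-scores[term], term))[:top_k]


if __name__ == "__main__":
    observed_pages = [
        "cheap flights to tokyo",
        "tokyo flights deals",
        "book tokyo hotel",
    ]
    print(rank_candidates(observed_pages, top_k=2))
```

In this toy input, terms that recur across several pages rise to the top, which matches the intuition that pages reached from one search tend to share the query's words. A realistic version would also need stop-word filtering and phrase-level (not just single-term) candidates.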
About the journal:
IJSIA aims to facilitate and support research related to security technology and its applications. Our Journal provides a chance for academic and industry professionals to discuss recent progress in the area of security technology and its applications.
Journal Topics:
- Access Control
- Ad Hoc & Sensor Network Security
- Applied Cryptography
- Authentication and Non-repudiation
- Cryptographic Protocols
- Denial of Service
- E-Commerce Security
- Identity and Trust Management
- Information Hiding
- Insider Threats and Countermeasures
- Intrusion Detection & Prevention
- Network & Wireless Security
- Peer-to-Peer Security
- Privacy and Anonymity
- Secure installation, generation and operation
- Security Analysis Methodologies
- Security assurance
- Security in Software Outsourcing
- Security products or systems
- Security technology
- Systems and Data Security