{"title":"Monitoring the initial DNS behavior of malicious domains","authors":"S. Hao, N. Feamster, R. Pandrangi","doi":"10.1145/2068816.2068842","DOIUrl":"https://doi.org/10.1145/2068816.2068842","url":null,"abstract":"Attackers often use URLs to advertise scams or propagate malware. Because the reputation of a domain can be used to identify malicious behavior, miscreants often register these domains \"just in time\" before an attack. This paper explores the DNS behavior of attack domains, as identified by appearance in a spam trap, shortly after the domains were registered. We explore the behavioral properties of these domains from two perspectives: (1) the DNS infrastructure associated with the domain, as is observable from the resource records; and (2) the DNS lookup patterns from networks that are looking up the domains initially. Our analysis yields many findings that may ultimately be useful for early detection of malicious domains. By monitoring the infrastructure for these malicious domains, we find that about 55% of scam domains occur in attacks at least one day after registration, suggesting the potential for early discovery of malicious domains, solely based on properties of the DNS infrastructure that resolves those domains. We also find that there are a few regions of IP address space that host name servers and other types of servers for only malicious domains. Malicious domains have resource records that are distributed more widely across IP address space, and they are more quickly looked up by a variety of different networks. We also identify a set of \"tainted\" ASes that are used heavily by bad domains to host resource records. The features we observe are often evident before any attack even takes place; ultimately, they might serve as the basis for a DNS-based early warning system for attacks.","PeriodicalId":287661,"journal":{"name":"ACM/SIGCOMM Internet Measurement Conference","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125079345","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Latency inflation with MPLS-based traffic engineering","authors":"Abhinav Pathak, Ming Zhang, Y. C. Hu, Ratul Mahajan, D. Maltz","doi":"10.1145/2068816.2068859","DOIUrl":"https://doi.org/10.1145/2068816.2068859","url":null,"abstract":"While MPLS has been extensively deployed in recent years, little is known about its behavior in practice. We examine the performance of MPLS in Microsoft's online service network (MSN), a well-provisioned multi-continent production network connecting tens of data centers. Using detailed traces collected over a 2-month period, we find that many paths experience significantly inflated latencies. We correlate occurrences of latency inflation with routers, links, and DC-pairs. This analysis sheds light on the causes of latency inflation and suggests several avenues for alleviating the problem.","PeriodicalId":287661,"journal":{"name":"ACM/SIGCOMM Internet Measurement Conference","volume":"31 19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126812374","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analyzing facebook privacy settings: user expectations vs. reality","authors":"Yabing Liu, K. Gummadi, B. Krishnamurthy, A. Mislove","doi":"10.1145/2068816.2068823","DOIUrl":"https://doi.org/10.1145/2068816.2068823","url":null,"abstract":"The sharing of personal data has emerged as a popular activity over online social networking sites like Facebook. As a result, the issue of online social network privacy has received significant attention in both the research literature and the mainstream media. Our overarching goal is to improve defaults and provide better tools for managing privacy, but we are limited by the fact that the full extent of the privacy problem remains unknown; there is little quantification of the incidence of incorrect privacy settings or the difficulty users face when managing their privacy. In this paper, we focus on measuring the disparity between the desired and actual privacy settings, quantifying the magnitude of the problem of managing privacy. We deploy a survey, implemented as a Facebook application, to 200 Facebook users recruited via Amazon Mechanical Turk. We find that 36% of content remains shared with the default privacy settings. We also find that, overall, privacy settings match users' expectations only 37% of the time, and when incorrect, almost always expose content to more users than expected. Finally, we explore how our results have potential to assist users in selecting appropriate privacy settings by examining the user-created friend lists. We find that these have significant correlation with the social network, suggesting that information from the social network may be helpful in implementing new tools for managing privacy.","PeriodicalId":287661,"journal":{"name":"ACM/SIGCOMM Internet Measurement Conference","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127872421","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Uncovering social network sybils in the wild","authors":"Zhi Yang, Christo Wilson, Xiao Wang, Tingting Gao, Ben Y. Zhao, Yafei Dai","doi":"10.1145/2068816.2068841","DOIUrl":"https://doi.org/10.1145/2068816.2068841","url":null,"abstract":"Sybil accounts are fake identities created to unfairly increase the power or resources of a single user. Researchers have long known about the existence of Sybil accounts in online communities such as file-sharing systems, but have not been able to perform large-scale measurements to detect them or measure their activities. In this paper, we describe our efforts to detect, characterize and understand Sybil account activity in the Renren online social network (OSN). We use ground truth provided by Renren Inc. to build measurement-based Sybil account detectors, and deploy them on Renren to detect over 100,000 Sybil accounts. We study these Sybil accounts, as well as an additional 560,000 Sybil accounts caught by Renren, and analyze their link creation behavior. Most interestingly, we find that contrary to prior conjecture, Sybil accounts in OSNs do not form tight-knit communities. Instead, they integrate into the social graph just like normal users. Using link creation timestamps, we verify that the large majority of links between Sybil accounts are created accidentally, unbeknownst to the attacker. Overall, only a very small portion of Sybil accounts are connected to other Sybils with social links. Our study shows that existing Sybil defenses are unlikely to succeed in today's OSNs, and we must design new techniques to effectively detect and defend against Sybil attacks.","PeriodicalId":287661,"journal":{"name":"ACM/SIGCOMM Internet Measurement Conference","volume":"257 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127540355","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sharing graphs using differentially private graph models","authors":"A. Sala, Xiaohan Zhao, Christo Wilson, Haitao Zheng, Ben Y. Zhao","doi":"10.1145/2068816.2068825","DOIUrl":"https://doi.org/10.1145/2068816.2068825","url":null,"abstract":"Continuing success of research on social and computer networks requires open access to realistic measurement datasets. While these datasets can be shared, generally in the form of social or Internet graphs, doing so often risks exposing sensitive user data to the public. Unfortunately, current techniques to improve privacy on graphs only target specific attacks, and have been proven to be vulnerable against powerful de-anonymization attacks. Our work seeks a solution to share meaningful graph datasets while preserving privacy. We observe a clear tension between strength of privacy protection and maintaining structural similarity to the original graph. To navigate the tradeoff, we develop a differentially-private graph model we call Pygmalion. Given a graph G and a desired level of ε-differential privacy guarantee, Pygmalion extracts a graph's detailed structure into degree correlation statistics, introduces noise into the resulting dataset, and generates a synthetic graph G'. G' maintains as much structural similarity to G as possible, while introducing enough differences to provide the desired privacy guarantee. We show that simply applying differential privacy to graphs results in the addition of significant noise that may disrupt graph structure, making it unsuitable for experimental study. Instead, we introduce a partitioning approach that provides identical privacy guarantees using much less noise. Applied to real graphs, this technique requires an order of magnitude less noise for the same privacy guarantees. Finally, we apply our graph model to Internet, web, and Facebook social graphs, and show that it produces synthetic graphs that closely match the originals in both graph structure metrics and behavior in application-level tests.","PeriodicalId":287661,"journal":{"name":"ACM/SIGCOMM Internet Measurement Conference","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128775999","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Counting YouTube videos via random prefix sampling","authors":"Jia Zhou, Yanhua Li, Vijay Kumar Adhikari, Zhi-Li Zhang","doi":"10.1145/2068816.2068851","DOIUrl":"https://doi.org/10.1145/2068816.2068851","url":null,"abstract":"Leveraging the characteristics of the YouTube video ID space and exploiting a unique property of the YouTube search API, in this paper we develop a random prefix sampling method to estimate the total number of videos hosted by YouTube. Through theoretical modeling and analysis, we demonstrate that the estimator based on this method is unbiased, and provide bounds on its variance and confidence interval. These bounds enable us to judiciously select sample sizes to control estimation errors. We evaluate our sampling method and validate the sampling results using two distinct collections of YouTube video IDs (namely, treating each collection as if it were the \"true\" collection of YouTube videos). We then apply our sampling method to the live YouTube system, and estimate that there were a total of roughly 500 million YouTube videos as of May 2011. Finally, using an unbiased collection of YouTube videos sampled by our method, we show that YouTube video view count statistics collected by prior methods (e.g., through crawling of related video links) are highly skewed, significantly under-estimating the number of videos with very small view counts (<1000); we also shed light on the bounds for the total storage YouTube must have and the network capacity needed to deliver YouTube videos.","PeriodicalId":287661,"journal":{"name":"ACM/SIGCOMM Internet Measurement Conference","volume":"8 Suppl 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128845084","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Understanding website complexity: measurements, metrics, and implications","authors":"Michael Butkiewicz, H. Madhyastha, V. Sekar","doi":"10.1145/2068816.2068846","DOIUrl":"https://doi.org/10.1145/2068816.2068846","url":null,"abstract":"Over the years, the web has evolved from simple text content from one server to a complex ecosystem with different types of content from servers spread across several administrative domains. There is anecdotal evidence of users being frustrated with high page load times or when obscure scripts cause their browser windows to freeze. Because page load times are known to directly impact user satisfaction, providers would like to understand if and how the complexity of their websites affects the user experience. While there is an extensive literature on measuring web graphs, website popularity, and the nature of web traffic, there has been little work in understanding how complex individual websites are, and how this complexity impacts the clients' experience. This paper is a first step to address this gap. To this end, we identify a set of metrics to characterize the complexity of websites both at a content-level (e.g., number and size of images) and service-level (e.g., number of servers/origins). We find that the distributions of these metrics are largely independent of a website's popularity rank. However, some categories (e.g., News) are more complex than others. More than 60% of websites have content from at least 5 non-origin sources and these contribute more than 35% of the bytes downloaded. In addition, we analyze which metrics are most critical for predicting page render and load times and find that the number of objects requested is the most important factor. With respect to variability in load times, however, we find that the number of servers is the best indicator.","PeriodicalId":287661,"journal":{"name":"ACM/SIGCOMM Internet Measurement Conference","volume":"159 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132068398","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Suspended accounts in retrospect: an analysis of twitter spam","authors":"Kurt Thomas, Chris Grier, D. Song, V. Paxson","doi":"10.1145/2068816.2068840","DOIUrl":"https://doi.org/10.1145/2068816.2068840","url":null,"abstract":"In this study, we examine the abuse of online social networks at the hands of spammers through the lens of the tools, techniques, and support infrastructure they rely upon. To perform our analysis, we identify over 1.1 million accounts suspended by Twitter for disruptive activities over the course of seven months. In the process, we collect a dataset of 1.8 billion tweets, 80 million of which belong to spam accounts. We use our dataset to characterize the behavior and lifetime of spam accounts, the campaigns they execute, and the widespread abuse of legitimate web services such as URL shorteners and free web hosting. We also identify an emerging marketplace of illegitimate programs operated by spammers that include Twitter account sellers, ad-based URL shorteners, and spam affiliate programs that help enable underground market diversification. Our results show that 77% of spam accounts identified by Twitter are suspended within one day of their first tweet. Because of these pressures, less than 9% of accounts form social relationships with regular Twitter users. Instead, 17% of accounts rely on hijacking trends, while 52% of accounts use unsolicited mentions to reach an audience. In spite of daily account attrition, we show how five spam campaigns controlling 145 thousand accounts combined are able to persist for months at a time, with each campaign enacting a unique spamming strategy. Surprisingly, three of these campaigns send spam directing visitors to reputable store fronts, blurring the line regarding what constitutes spam on social networks.","PeriodicalId":287661,"journal":{"name":"ACM/SIGCOMM Internet Measurement Conference","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130651094","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Q-score: proactive service quality assessment in a large IPTV system","authors":"H. Song, Zihui Ge, A. Mahimkar, Jia Wang, J. Yates, Yin Zhang, A. Basso, Min Chen","doi":"10.1145/2068816.2068836","DOIUrl":"https://doi.org/10.1145/2068816.2068836","url":null,"abstract":"In large-scale IPTV systems, it is essential to maintain high service quality while providing a wider variety of service features than typical traditional TV. Thus service quality assessment systems are of paramount importance as they monitor the user-perceived service quality and alert when issues occur. For IPTV systems, however, there is no simple metric to represent user-perceived service quality and Quality of Experience (QoE). Moreover, there is only limited user feedback, often in the form of noisy and delayed customer calls. Therefore, we aim to approximate the QoE through a selected set of performance indicators in a proactive (i.e., detect issues before customers report to call centers) and scalable fashion. In this paper, we present a service quality assessment framework, Q-score, which accurately learns a small set of performance indicators most relevant to user-perceived service quality, and proactively infers service quality in a single score. We evaluate Q-score using network data collected from a commercial IPTV service provider and show that Q-score is able to predict 60% of the service problems that are reported by customers with 0.1% false positives. Through Q-score, we have (i) gained insight into various types of service problems causing user dissatisfaction, including why users tend to react promptly to sound issues but slowly to video issues; (ii) identified and quantified the opportunity to proactively detect the service quality degradation of individual customers before severe performance impact occurs; and (iii) observed the possibility of allocating customer care workforce to potentially troubling service areas before issues break out.","PeriodicalId":287661,"journal":{"name":"ACM/SIGCOMM Internet Measurement Conference","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122139839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identifying diverse usage behaviors of smartphone apps","authors":"Qiang Xu, Jeffrey Erman, Alexandre Gerber, Z. Morley Mao, Jeffrey Pang, Shobha Venkataraman","doi":"10.1145/2068816.2068847","DOIUrl":"https://doi.org/10.1145/2068816.2068847","url":null,"abstract":"Smartphone users are increasingly shifting to using apps as \"gateways\" to Internet services rather than traditional web browsers. App marketplaces for iOS, Android, and Windows Phone platforms have made it attractive for developers to deploy apps and easy for users to discover and start using many network-enabled apps quickly. For example, it was recently reported that the iOS AppStore has more than 350K apps and more than 10 billion downloads. Furthermore, the appearance of tablets and mobile devices with other form factors, which also use these marketplaces, has increased the diversity in apps and their user population. Despite the increasing importance of apps as gateways to network services, we have a much sparser understanding of how, where, and when they are used compared to traditional web services, particularly at scale. This paper takes a first step in addressing this knowledge gap by presenting results on app usage at a national level using anonymized network measurements from a tier-1 cellular carrier in the U.S. We identify traffic from distinct marketplace apps based on HTTP signatures and present aggregate results on their spatial and temporal prevalence, locality, and correlation.","PeriodicalId":287661,"journal":{"name":"ACM/SIGCOMM Internet Measurement Conference","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127615371","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}