Mike P. Wittie, V. Pejović, Lara B. Deek, K. Almeroth, Ben Y. Zhao
{"title":"Exploiting locality of interest in online social networks","authors":"Mike P. Wittie, V. Pejović, Lara B. Deek, K. Almeroth, Ben Y. Zhao","doi":"10.1145/1921168.1921201","DOIUrl":"https://doi.org/10.1145/1921168.1921201","url":null,"abstract":"Online Social Networks (OSN) are fun, popular, and socially significant. An integral part of their success is the immense size of their global user base. To provide a consistent service to all users, Facebook, the world's largest OSN, is heavily dependent on centralized U.S. data centers, which renders service outside of the U.S. sluggish and wasteful of Internet bandwidth. In this paper, we investigate the detailed causes of these two problems and identify mitigation opportunities. Because details of Facebook's service remain proprietary, we treat the OSN as a black box and reverse engineer its operation from publicly available traces. We find that contrary to current wisdom, OSN state is amenable to partitioning and that its fine grained distribution and processing can significantly improve performance without loss in service consistency. Through simulations of reconstructed Facebook traffic over measured Internet paths, we show that user requests can be processed 79% faster and use 91% less bandwidth. 
We conclude that the partitioning of OSN state is an attractive scaling strategy for Facebook and other OSN services.","PeriodicalId":20688,"journal":{"name":"Proceedings of The 6th International Conference on Innovation in Science and Technology","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2010-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90325897","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chuanxiong Guo, Guohan Lu, Helen J. Wang, Shuang Yang, Chao Kong, Peng Sun, Wenfei Wu, Yongguang Zhang
{"title":"SecondNet: a data center network virtualization architecture with bandwidth guarantees","authors":"Chuanxiong Guo, Guohan Lu, Helen J. Wang, Shuang Yang, Chao Kong, Peng Sun, Wenfei Wu, Yongguang Zhang","doi":"10.1145/1921168.1921188","DOIUrl":"https://doi.org/10.1145/1921168.1921188","url":null,"abstract":"In this paper, we propose virtual data center (VDC) as the unit of resource allocation for multiple tenants in the cloud. VDCs are more desirable than physical data centers because the resources allocated to VDCs can be rapidly adjusted as tenants' needs change. To enable the VDC abstraction, we design a data center network virtualization architecture called SecondNet. SecondNet achieves scalability by distributing all the virtual-to-physical mapping, routing, and bandwidth reservation state in server hypervisors. Its port-switching based source routing (PSSR) further makes SecondNet applicable to arbitrary network topologies using commodity servers and switches. SecondNet introduces a centralized VDC allocation algorithm for bandwidth guaranteed virtual to physical mapping. Simulations demonstrate that our VDC allocation achieves high network utilization and low time complexity. 
Our implementation and experiments show that we can build SecondNet on top of various network topologies, and SecondNet provides bandwidth guarantee and elasticity, as designed.","PeriodicalId":20688,"journal":{"name":"Proceedings of The 6th International Conference on Innovation in Science and Technology","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2010-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72902532","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Min Y. Mun, Shuai Hao, Nilesh Mishra, Katie Shilton, J. Burke, D. Estrin, Mark H. Hansen, R. Govindan
{"title":"Personal data vaults: a locus of control for personal data streams","authors":"Min Y. Mun, Shuai Hao, Nilesh Mishra, Katie Shilton, J. Burke, D. Estrin, Mark H. Hansen, R. Govindan","doi":"10.1145/1921168.1921191","DOIUrl":"https://doi.org/10.1145/1921168.1921191","url":null,"abstract":"The increasing ubiquity of the mobile phone is creating many opportunities for personal context sensing, and will result in massive databases of individuals' sensitive information incorporating locations, movements, images, text annotations, and even health data. In existing system architectures, users upload their raw (unprocessed or filtered) data streams directly to content-service providers and have little control over their data once they \"opt-in\". We present Personal Data Vaults (PDVs), a privacy architecture in which individuals retain ownership of their data. Data are routinely filtered before being shared with content-service providers, and users or data custodian services can participate in making controlled data-sharing decisions. Introducing a PDV gives users flexible and granular access control over data. To reduce the burden on users and improve usability, we explore three mechanisms for managing data policies: Granular ACL, Trace-audit and Rule Recommender. 
We have implemented a proof-of-concept PDV and evaluated it using real data traces collected from two personal participatory sensing applications.","PeriodicalId":20688,"journal":{"name":"Proceedings of The 6th International Conference on Innovation in Science and Technology","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2010-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85122975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
L. Popa, S. Ratnasamy, G. Iannaccone, A. Krishnamurthy, I. Stoica
{"title":"A cost comparison of datacenter network architectures","authors":"L. Popa, S. Ratnasamy, G. Iannaccone, A. Krishnamurthy, I. Stoica","doi":"10.1145/1921168.1921189","DOIUrl":"https://doi.org/10.1145/1921168.1921189","url":null,"abstract":"There is a growing body of research exploring new network architectures for the data center. These proposals all seek to improve the scalability and cost-effectiveness of current data center networks, but adopt very different approaches to doing so. For example, some proposals build networks entirely out of switches while others do so using a combination of switches and servers. How do these different network architectures compare? For that matter, by what metrics should we even begin to compare these architectures? Understanding the tradeoffs between different approaches is important both for operators making deployment decisions and to guide future research. In this paper, we take a first step toward understanding the tradeoffs between different data center network architectures. We use high-level models of different classes of data center networks and compare them on cost using both current and predicted trends in cost and power consumption.","PeriodicalId":20688,"journal":{"name":"Proceedings of The 6th International Conference on Innovation in Science and Technology","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2010-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84210936","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Internet is flat: modeling the transition from a transit hierarchy to a peering mesh","authors":"A. Dhamdhere, C. Dovrolis","doi":"10.1145/1921168.1921196","DOIUrl":"https://doi.org/10.1145/1921168.1921196","url":null,"abstract":"Recent measurements and anecdotal evidence indicate that the Internet ecosystem is rapidly evolving from a multi-tier hierarchy built mostly with transit (customer-provider) links to a dense mesh formed with mostly peering links. This transition can have major impact on the global Internet economy as well as on the traffic flow and topological structure of the Internet. In this paper, we study this evolutionary transition with an agent-based network formation model that captures key aspects of the interdomain ecosystem, viz., interdomain traffic flow and routing, provider and peer selection strategies, geographical constraints, and the economics of transit and peering interconnections. The model predicts several substantial differences between the Hierarchical Internet and the Flat Internet in terms of topological structure, path lengths, interdomain traffic flow, and the profitability of transit providers. We also quantify the effect of the three factors driving this evolutionary transition. 
Finally, we examine a hypothetical scenario in which a large content provider produces more than half of the total Internet traffic.","PeriodicalId":20688,"journal":{"name":"Proceedings of The 6th International Conference on Innovation in Science and Technology","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2010-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84318554","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yeon-sup Lim, Hyunchul Kim, Jiwoong Jeong, Chong-kwon Kim, T. Kwon, Yanghee Choi
{"title":"Internet traffic classification demystified: on the sources of the discriminative power","authors":"Yeon-sup Lim, Hyunchul Kim, Jiwoong Jeong, Chong-kwon Kim, T. Kwon, Yanghee Choi","doi":"10.1145/1921168.1921180","DOIUrl":"https://doi.org/10.1145/1921168.1921180","url":null,"abstract":"Recent research on Internet traffic classification has yielded a number of data mining techniques for distinguishing types of traffic, but no systematic analysis of \"Why\" some algorithms achieve high accuracies. In pursuit of empirically grounded answers to the \"Why\" question, which is critical in understanding and establishing a scientific ground for traffic classification research, this paper reveals the three sources of the discriminative power in classifying the Internet application traffic: (i) ports, (ii) the sizes of the first one-two (for UDP flows) or four-five (for TCP flows) packets, and (iii) discretization of those features. We find that C4.5 performs the best under any circumstances, and we identify the reason why: the algorithm discretizes input features during classification operations. We also find that the entropy-based Minimum Description Length discretization on ports and packet size features substantially improves the classification accuracy of every machine learning algorithm tested (by as much as 59.8%!) and makes all of them achieve >93% accuracy on average without any algorithm-specific tuning processes. 
Our results indicate that dealing with the ports and packet size features as discrete nominal intervals, not as continuous numbers, is the essential basis for accurate traffic classification (i.e., the features should be discretized first), regardless of the classification algorithm used.","PeriodicalId":20688,"journal":{"name":"Proceedings of The 6th International Conference on Innovation in Science and Technology","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2010-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76021333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
V. Bharti, P. Kankar, L. Setia, Gonca Gürsun, Anukool Lakhina, M. Crovella
{"title":"Inferring invisible traffic","authors":"V. Bharti, P. Kankar, L. Setia, Gonca Gürsun, Anukool Lakhina, M. Crovella","doi":"10.1145/1921168.1921197","DOIUrl":"https://doi.org/10.1145/1921168.1921197","url":null,"abstract":"A traffic matrix encompassing the entire Internet would be very valuable. Unfortunately, from any given vantage point in the network, most traffic is invisible. In this paper we describe results that hold some promise for this problem. First, we show a new characterization result: traffic matrices (TMs) typically show very low effective rank. This result refers to TMs that are purely spatial (have no temporal component), over a wide range of spatial granularities. Next, we define an inference problem whose solution allows one to infer invisible TM elements. This problem relies crucially on an atomicity property we define. Finally, we show example solutions of this inference problem via two different methods: regularized regression and matrix completion. The example consists of an AS inferring the amount of invisible traffic passing between other pairs of ASes. Using this example we illustrate the accuracy of the methods as a function of spatial granularity.","PeriodicalId":20688,"journal":{"name":"Proceedings of The 6th International Conference on Innovation in Science and Technology","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2010-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78545673","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ankit Singla, Brighten Godfrey, K. Fall, G. Iannaccone, S. Ratnasamy
{"title":"Scalable routing on flat names","authors":"Ankit Singla, Brighten Godfrey, K. Fall, G. Iannaccone, S. Ratnasamy","doi":"10.1145/1921168.1921195","DOIUrl":"https://doi.org/10.1145/1921168.1921195","url":null,"abstract":"We introduce a protocol which routes on flat, location-independent identifiers with guaranteed scalability and low stretch. Our design builds on theoretical advances in the area of compact routing, and is the first to realize these guarantees in a dynamic distributed setting.","PeriodicalId":20688,"journal":{"name":"Proceedings of The 6th International Conference on Innovation in Science and Technology","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2010-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80148004","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spatio-temporal patterns in network events","authors":"Ting Wang, M. Srivatsa, D. Agrawal, Ling Liu","doi":"10.1145/1921168.1921172","DOIUrl":"https://doi.org/10.1145/1921168.1921172","url":null,"abstract":"Operational networks typically generate massive monitoring data that consist of local (in both space and time) observations of the status of the networks. It is often hypothesized that such data exhibit both spatial and temporal correlation based on the underlying network topology and time of occurrence; identifying such correlation patterns offers valuable insights into global network phenomena (e.g., fault cascading in communication networks). In this paper we introduce a new class of models suitable for learning, indexing, and identifying spatio-temporal patterns in network monitoring data. We exemplify our techniques with the application of fault diagnosis in enterprise networks. We show how it can help network management systems (NMSes) to efficiently detect and localize potential faults (e.g., failure of routing protocols or network equipment) by analyzing massive operational event streams (e.g., alerts, alarms, and metrics). We provide results from extensive experimental studies over real network event and topology datasets to explore the efficacy of our solution.","PeriodicalId":20688,"journal":{"name":"Proceedings of The 6th International Conference on Innovation in Science and Technology","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2010-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77823279","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Upendra Shevade, Yi-Chao Chen, L. Qiu, Yin Zhang, V. Chandar, M. Han, H. Song, Yousuk Seung
{"title":"Enabling high-bandwidth vehicular content distribution","authors":"Upendra Shevade, Yi-Chao Chen, L. Qiu, Yin Zhang, V. Chandar, M. Han, H. Song, Yousuk Seung","doi":"10.1145/1921168.1921199","DOIUrl":"https://doi.org/10.1145/1921168.1921199","url":null,"abstract":"We present VCD, a novel system for enabling high-bandwidth content distribution in vehicular networks. In VCD, a vehicle opportunistically communicates with nearby access points (APs) to download the content of interest. To fully take advantage of such transient contact with APs, we proactively push content to the APs that the vehicles will likely visit in the near future. In this way, vehicles can enjoy the full wireless capacity instead of being bottlenecked by the Internet connectivity, which is either slow or even unavailable. We develop a new algorithm for predicting the APs that will soon be visited by the vehicles. We then develop a replication scheme that leverages the synergy among (i) Internet connectivity (which is persistent but has limited coverage and low bandwidth), (ii) local wireless connectivity (which has high bandwidth but transient duration), (iii) vehicular relay connectivity (which has high bandwidth but high delay), and (iv) mesh connectivity among APs (which has high bandwidth but low coverage). We demonstrate the effectiveness of the VCD system using trace-driven simulation and Emulab emulation based on real taxi traces. 
We further deploy VCD in two vehicular networks: one using 802.11b and the other using 802.11n, to demonstrate its effectiveness.","PeriodicalId":20688,"journal":{"name":"Proceedings of The 6th International Conference on Innovation in Science and Technology","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2010-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90120054","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}