Demystifying the dark side of the middle: a field study of middlebox failures in datacenters

Rahul Potharaju, Navendu Jain
Published in: Proceedings of the 2013 conference on Internet measurement conference
Publication date: 2013-10-23
DOI: 10.1145/2504730.2504737 (https://doi.org/10.1145/2504730.2504737)
Citations: 137

Abstract
Network appliances or middleboxes such as firewalls, intrusion detection and prevention systems (IDPS), load balancers, and VPNs form an integral part of datacenters and enterprise networks. Recognizing their importance and shortcomings, the research community has proposed software implementations, policy-aware switching, consolidation appliances, moving middlebox processing to VMs and end hosts, and even offloading it to the cloud. While such efforts can use middlebox failure characteristics to improve their reliability, management, and cost-effectiveness, little has been reported on these failures in the field. In this paper, we make one of the first attempts to perform a large-scale empirical study of middlebox failures over two years in a service provider network comprising thousands of middleboxes across tens of datacenters. We find that middlebox failures are prevalent and that they can significantly impact hosted services. Several of our findings differ in key aspects from commonly held views: (1) most failures are grey, dominated by connectivity errors and link flaps that exhibit intermittent connectivity; (2) hardware faults and overload problems are present, but they are not in the majority; (3) middleboxes experience a variety of misconfigurations such as incorrect rules, VLAN misallocation, and mismatched keys; and (4) middlebox failover is ineffective in about 33% of the cases for load balancers and firewalls due to configuration bugs, faulty failovers, and software version mismatch. Finally, we analyze current middlebox proposals based on our study and discuss directions for future research.