Ravana: controller fault-tolerance in software-defined networking

N. Katta, Haoyu Zhang, M. Freedman, J. Rexford
{"title":"Ravana: controller fault-tolerance in software-defined networking","authors":"N. Katta, Haoyu Zhang, M. Freedman, J. Rexford","doi":"10.1145/2774993.2774996","DOIUrl":null,"url":null,"abstract":"Software-defined networking (SDN) offers greater flexibility than traditional distributed architectures, at the risk of the controller being a single point-of-failure. Unfortunately, existing fault-tolerance techniques, such as replicated state machine, are insufficient to ensure correct network behavior under controller failures. The challenge is that, in addition to the application state of the controllers, the switches maintain hard state that must be handled consistently. Thus, it is necessary to incorporate switch state into the system model to correctly offer a \"logically centralized\" controller. We introduce Ravana, a fault-tolerant SDN controller platform that processes the control messages transactionally and exactly once (at both the controllers and the switches). Ravana maintains these guarantees in the face of both controller and switch crashes. The key insight in Ravana is that replicated state machines can be extended with lightweight switch-side mechanisms to guarantee correctness, without involving the switches in an elaborate consensus protocol. Our prototype implementation of Ravana enables unmodified controller applications to execute in a fault-tolerant fashion. Experiments show that Ravana achieves high throughput with reasonable overhead, compared to a single controller, with a failover time under 100ms.","PeriodicalId":316190,"journal":{"name":"Proceedings of the 1st ACM SIGCOMM Symposium on Software Defined Networking Research","volume":"63 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"153","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 1st ACM SIGCOMM Symposium on Software Defined Networking Research","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2774993.2774996","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 153

Abstract

Software-defined networking (SDN) offers greater flexibility than traditional distributed architectures, at the risk of the controller being a single point-of-failure. Unfortunately, existing fault-tolerance techniques, such as replicated state machine, are insufficient to ensure correct network behavior under controller failures. The challenge is that, in addition to the application state of the controllers, the switches maintain hard state that must be handled consistently. Thus, it is necessary to incorporate switch state into the system model to correctly offer a "logically centralized" controller. We introduce Ravana, a fault-tolerant SDN controller platform that processes the control messages transactionally and exactly once (at both the controllers and the switches). Ravana maintains these guarantees in the face of both controller and switch crashes. The key insight in Ravana is that replicated state machines can be extended with lightweight switch-side mechanisms to guarantee correctness, without involving the switches in an elaborate consensus protocol. Our prototype implementation of Ravana enables unmodified controller applications to execute in a fault-tolerant fashion. Experiments show that Ravana achieves high throughput with reasonable overhead, compared to a single controller, with a failover time under 100ms.
软件定义网络中的控制器容错
软件定义网络(SDN)提供了比传统分布式体系结构更大的灵活性,但存在控制器成为单点故障的风险。不幸的是,现有的容错技术,如复制状态机,不足以确保控制器故障时正确的网络行为。挑战在于,除了控制器的应用程序状态外,开关还保持必须一致处理的硬状态。因此,有必要将开关状态合并到系统模型中,以正确地提供“逻辑集中”的控制器。我们介绍了Ravana,这是一个容错SDN控制器平台,它以事务方式处理控制消息,并且只处理一次(在控制器和交换机上)。在控制器和交换机崩溃的情况下,Ravana保持这些保证。Ravana的关键观点是,复制的状态机可以用轻量级的开关端机制进行扩展,以保证正确性,而无需将开关涉及到复杂的共识协议中。我们的Ravana原型实现使未经修改的控制器应用程序能够以容错方式执行。实验表明,与单个控制器相比,Ravana在合理的开销下实现了高吞吐量,故障转移时间低于100ms。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信