vFFR: A Very Fast Failure Recovery Strategy Implemented in Devices With Programmable Data Plane

Impact factor: 6.3; JCR Q1 (ENGINEERING, ELECTRICAL & ELECTRONIC)
David Franco;Marivi Higuero;Ane Sanz;Juanjo Unzilla;Maider Huarte
DOI: 10.1109/OJCOMS.2024.3493417
Journal: IEEE Open Journal of the Communications Society, vol. 5, pp. 7121-7146
Published: 2024-11-07
Publication type: Journal Article
Full text: https://ieeexplore.ieee.org/document/10746495/
PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10746495
Citations: 0

Abstract

vFFR: A Very Fast Failure Recovery Strategy Implemented in Devices With Programmable Data Plane
The rapid emergence of new applications and services, and their increased demand for Quality of Service (QoS), have a significant impact on the development of today’s communication networks. As a result, communication networks are constantly evolving towards new architectures, such as the 6th Generation (6G) of communication systems, currently being studied in academic and research environments. One of the most critical aspects of designing communication networks is meeting strict delay and packet loss requirements. In this context, although link failure recovery has been widely addressed in the literature, it remains one of the main causes of packet losses and delays in the network. The failure recovery time in currently deployed technologies is still far from the sub-millisecond delay required in 6G networks. The time required for distributed network architectures to converge to a common network state after a link failure is excessive. In contrast, centralized architectures such as Software-Defined Networking (SDN) solve this problem but still need to notify a centralized controller of the failure, which increases the recovery time. This paper proposes a very Fast Failure Recovery (vFFR) strategy that can recover from link failures in sub-millisecond timescales by reacting directly from the data plane of the network devices while maintaining a synchronized state with the centralized controller. We first analyze current failure recovery strategies and classify them according to the techniques used to optimize failure recovery time. Afterward, we describe the design of a vFFR strategy that combines three data plane recovery algorithms to reduce latency and packet loss under varying network conditions. Our vFFR strategy has been modeled in the P4 language and tested on an emulation platform to validate the three data plane recovery algorithms under different conditions. The results show that latency varies according to the alternate path selected in the recovery algorithm, and the packet loss rate remains constant even when the background traffic reaches 90% of the link capacity. In addition, the vFFR strategy has been implemented on Intel Tofino devices, achieving a failure recovery time lower than 500 µs and a total frame loss rate below 0.005% in all cases, including those with a 35 Gbps load.
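The general idea of reacting to a link failure locally in the data plane, while the controller is informed separately, can be illustrated with a toy model. The sketch below is not the vFFR design and does not reproduce any of the paper's three recovery algorithms; it is a generic fast-failover illustration written in Python rather than P4, and every name in it (`Switch`, `routes`, `port_up`, `pending_reports`, `link_down`, `forward`) is a hypothetical chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Switch:
    """Toy switch (hypothetical): ordered ports per destination plus port status."""
    # destination -> ports in preference order (primary first, then backups)
    routes: dict[str, list[int]]
    # local view of which ports are currently up
    port_up: dict[int, bool]
    # failures queued for the controller; forwarding never waits on this list
    pending_reports: list[int] = field(default_factory=list)

    def link_down(self, port: int) -> None:
        """Record a locally detected failure once and queue a controller report."""
        if self.port_up.get(port, False):
            self.port_up[port] = False
            self.pending_reports.append(port)

    def forward(self, dst: str) -> Optional[int]:
        """Return the first live port for dst, or None if every port is down."""
        for port in self.routes.get(dst, []):
            if self.port_up.get(port, False):
                return port
        return None


# Assumed example topology: one destination reachable over three ports.
sw = Switch(routes={"h2": [1, 2, 3]}, port_up={1: True, 2: True, 3: True})
before = sw.forward("h2")   # primary port is used while it is up
sw.link_down(1)             # failure detected locally
after = sw.forward("h2")    # next packet already takes the first backup
print(before, after, sw.pending_reports)
```

In this sketch the forwarding decision depends only on state held in the switch, so the switch-over does not wait for a round trip to the controller; the queued report stands in for the separate step of keeping the controller's view synchronized.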
Source journal
CiteScore: 13.70
Self-citation rate: 3.80%
Articles published: 94
Review time: 10 weeks
Journal description: The IEEE Open Journal of the Communications Society (OJ-COMS) is an open access, all-electronic journal that publishes original high-quality manuscripts on advances in the state of the art of telecommunications systems and networks. The papers in IEEE OJ-COMS are included in Scopus. Submissions reporting new theoretical findings (including novel methods, concepts, and studies) and practical contributions (including experiments and development of prototypes) are welcome. Additionally, survey and tutorial articles are considered. IEEE OJ-COMS received its debut impact factor of 7.9 according to the Journal Citation Reports (JCR) 2023. The journal covers science, technology, applications and standards for information organization, collection and transfer using electronic, optical and wireless channels and networks. Some specific areas covered include: systems and network architecture, control and management; protocols, software, and middleware; quality of service, reliability, and security; modulation, detection, coding, and signaling; switching and routing; mobile and portable communications; terminals and other end-user devices; networks for content distribution and distributed computing; communications-based distributed resources control.