Resilient overlay networks

D. Andersen, H. Balakrishnan, F. Kaashoek, R. Morris
{"title":"Resilient overlay networks","authors":"D. Andersen, H. Balakrishnan, F. Kaashoek, R. Morris","doi":"10.1145/502034.502048","DOIUrl":null,"url":null,"abstract":"A Resilient Overlay Network (RON) is an architecture that allows distributed Internet applications to detect and recover from path outages and periods of degraded performance within several seconds, improving over today's wide-area routing protocols that take at least several minutes to recover. A RON is an application-layer overlay on top of the existing Internet routing substrate. The RON nodes monitor the functioning and quality of the Internet paths among themselves, and use this information to decide whether to route packets directly over the Internet or by way of other RON nodes, optimizing application-specific routing metrics.Results from two sets of measurements of a working RON deployed at sites scattered across the Internet demonstrate the benefits of our architecture. For instance, over a 64-hour sampling period in March 2001 across a twelve-node RON, there were 32 significant outages, each lasting over thirty minutes, over the 132 measured paths. RON's routing mechanism was able to detect, recover, and route around all of them, in less than twenty seconds on average, showing that its methods for fault detection and recovery work well at discovering alternate paths in the Internet. Furthermore, RON was able to improve the loss rate, latency, or throughput perceived by data transfers; for example, about 5% of the transfers doubled their TCP throughput and 5% of our transfers saw their loss probability reduced by 0.05. We found that forwarding packets via at most one intermediate RON node is sufficient to overcome faults and improve performance in most cases. These improvements, particularly in the area of fault detection and recovery, demonstrate the benefits of moving some of the control over routing into the hands of end-systems.","PeriodicalId":263344,"journal":{"name":"Proceedings of the eighteenth ACM symposium on Operating systems principles","volume":"40 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2001-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"409","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the eighteenth ACM symposium on Operating systems principles","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/502034.502048","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 409

Abstract

A Resilient Overlay Network (RON) is an architecture that allows distributed Internet applications to detect and recover from path outages and periods of degraded performance within several seconds, improving over today's wide-area routing protocols that take at least several minutes to recover. A RON is an application-layer overlay on top of the existing Internet routing substrate. The RON nodes monitor the functioning and quality of the Internet paths among themselves, and use this information to decide whether to route packets directly over the Internet or by way of other RON nodes, optimizing application-specific routing metrics.Results from two sets of measurements of a working RON deployed at sites scattered across the Internet demonstrate the benefits of our architecture. For instance, over a 64-hour sampling period in March 2001 across a twelve-node RON, there were 32 significant outages, each lasting over thirty minutes, over the 132 measured paths. RON's routing mechanism was able to detect, recover, and route around all of them, in less than twenty seconds on average, showing that its methods for fault detection and recovery work well at discovering alternate paths in the Internet. Furthermore, RON was able to improve the loss rate, latency, or throughput perceived by data transfers; for example, about 5% of the transfers doubled their TCP throughput and 5% of our transfers saw their loss probability reduced by 0.05. We found that forwarding packets via at most one intermediate RON node is sufficient to overcome faults and improve performance in most cases. These improvements, particularly in the area of fault detection and recovery, demonstrate the benefits of moving some of the control over routing into the hands of end-systems.
弹性覆盖网络
弹性覆盖网络(RON)是一种体系结构,它允许分布式互联网应用程序在几秒钟内检测并从路径中断和性能下降期间恢复,改进了目前需要至少几分钟才能恢复的广域路由协议。RON是现有Internet路由基板之上的应用层覆盖层。RON节点监视它们之间Internet路径的功能和质量,并使用这些信息来决定是直接通过Internet路由数据包,还是通过其他RON节点路由数据包,从而优化特定于应用程序的路由指标。部署在分散在Internet上的站点上的工作RON的两组测量结果演示了我们的体系结构的好处。例如,2001年3月,在一个12个节点的RON上进行了64小时的采样,在132条测量路径上出现了32次严重的中断,每次持续时间超过30分钟。RON的路由机制能够在平均不到20秒的时间内检测、恢复并绕过所有这些故障,这表明它的故障检测和恢复方法在发现Internet中的备用路径方面工作得很好。此外,RON能够提高数据传输的损失率、延迟或吞吐量;例如,大约5%的传输将TCP吞吐量提高了一倍,5%的传输的丢失概率降低了0.05。我们发现,在大多数情况下,通过最多一个中间RON节点转发数据包足以克服故障并提高性能。这些改进,特别是在故障检测和恢复领域,证明了将对路由的一些控制转移到终端系统手中的好处。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信