Disco: Fast, good, and cheap outage detection

Anant Shah, Romain Fontugne, E. Aben, C. Pelsser, R. Bush
Published in: 2017 Network Traffic Measurement and Analysis Conference (TMA), 2017-06-01
DOI: 10.23919/TMA.2017.8002902 (https://doi.org/10.23919/TMA.2017.8002902)
Citations: 24

Abstract

Outage detection has been studied from different angles, such as active probing, analysis of background radiation, and control-plane information. We approach outage detection from a new perspective. Disco is a detection technique that uses existing long-running TCP connections to identify bursts of disconnections. The benefits are considerable: without adding a single packet to the traffic, we can monitor Internet-wide swaths of infrastructure that were not monitored previously because they are, for example, unresponsive to ICMP probes or behind NATs. With Disco we analyze state changes on connections between RIPE Atlas probes and the RIPE Atlas infrastructure. This data, which is originally logged to monitor probe availability, has a small footprint and is available as a publicly accessible live stream, which makes lightweight, near-real-time outage detection possible. Probes perform planned traceroute measurements regardless of their connectivity to the RIPE Atlas infrastructure. This gives us, at no cost, the advantage of viewing the outage from the inside out, as the probes experienced it, and of characterizing the outage after the fact. Thus, we present an outage detection system able to run in near real time (fast), with a precision of 95% (good), and without generating any new measurement traffic (cheap). We studied historical probe disconnections from 2011 to 2016 and report on the 443 most prominent outages. To validate our results, we inspected traceroute results from affected probes and compared our detection to that of Trinocular.
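
To make the idea of a "burst of disconnections" concrete, here is a minimal Python sketch. It is not the detection method used in the paper, which the abstract does not specify. It assumes a simple sliding-window count over sorted disconnection timestamps, and the function name `detect_bursts`, the 300-second window, and the threshold of 3 events are all hypothetical choices for illustration. In practice the timestamps would come from the probe connection log described in the abstract; a plain list stands in for it here.

```python
from collections import deque
from typing import Iterable, List, Optional, Tuple


def detect_bursts(
    disconnect_times: Iterable[float],
    window_s: float = 300.0,   # assumed window length, not from the paper
    threshold: int = 3,        # assumed minimum events per window
) -> List[Tuple[float, float]]:
    """Return (start, end) intervals where at least `threshold`
    disconnections fall within `window_s` seconds of each other.

    Timestamps are in seconds and must be sorted in ascending order.
    """
    window: deque = deque()
    bursts: List[Tuple[float, float]] = []
    burst_start: Optional[float] = None
    last_hit: Optional[float] = None

    for t in disconnect_times:
        window.append(t)
        # Drop events that fell out of the sliding window.
        while t - window[0] > window_s:
            window.popleft()

        if len(window) >= threshold:
            # Enough recent disconnections: open or extend a burst.
            if burst_start is None:
                burst_start = window[0]
            last_hit = t
        elif burst_start is not None:
            # Activity dropped below the threshold: close the burst.
            bursts.append((burst_start, last_hit))
            burst_start = None

    if burst_start is not None:
        bursts.append((burst_start, last_hit))
    return bursts


if __name__ == "__main__":
    # Synthetic timestamps: isolated events plus one cluster near t=2000.
    events = [0, 1000, 2000, 2010, 2020, 2030, 5000, 9000]
    print(detect_bursts(events, window_s=60, threshold=3))
```

A fixed threshold like this ignores how many probes are normally connected in a given network or region, so a real system would need to scale it to the monitored population.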