Toward accurate and practical network tomography
Denisa Ghita, K. Argyraki, Patrick Thiran
ACM SIGOPS Oper. Syst. Rev. (Journal Article), published 2013-01-29
DOI: 10.1145/2433140.2433146 (https://doi.org/10.1145/2433140.2433146)
Citation count: 15
Abstract
Troubleshooting large networks is hard; when an end-user complains that she has “network problems,” there is typically a large number of possible causes. For example, the end-user’s own machine may be damaged, misconfigured, or compromised; a network element that handles her traffic may be congested or malfunctioning; or the destination she is trying to reach may be filtering her traffic. To diagnose such problems, a network operator normally has to probe the network’s elements to collect relevant statistics, like packet loss or bandwidth utilization. The challenge, though, is that the network operator often does not have direct access to all the suspected network elements and hence cannot probe them; e.g., the operator of an edge network does not have access to the equipment of her Internet service provider (ISP). Network tomography is an elegant approach to network troubleshooting: just as medical tomography observes an organ from different vantage points and combines the observations to get knowledge of the organ’s internals (without dissecting it), so network tomography observes the characteristics of different end-to-end network paths and combines the observations to infer the characteristics of individual network links (without probing them). This approach is applicable in scenarios where one needs to monitor the behavior and performance of a network without having direct access to its elements. For instance, the operators of edge networks could use network tomography to monitor the behavior and performance of their ISPs; an ISP operator could use it to monitor the behavior and performance of its peers. However, there are reasons to be skeptical about the usefulness of network tomography in practice. Even though it was invented more than 10 years ago and is still a topic of active research, it has not seen any real deployment. We believe the reason is that existing tomography algorithms make certain simplifying assumptions that do not always hold in a real network, which means that the algorithms’ results may be inaccurate. Most importantly, there is no way to determine the extent of this inaccuracy. In other words, today there is no way for a network operator who employs tomography for network troubleshooting to compute the certainty of her diagnosis.
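To make the idea of inferring link characteristics from end-to-end paths concrete, here is a minimal sketch. It is not the authors' algorithm. It assumes a hypothetical star topology in which three hosts A, B, and C each attach to one central router through links a, b, and c. It also assumes that links drop packets independently, so a path's delivery rate is the product of its links' delivery rates. That independence assumption is the kind of simplification that may not hold in a real network. The function name and the numbers are illustrative only.

```python
import math


def infer_star_links(t_ab, t_ac, t_bc):
    """Estimate per-link delivery rates from three end-to-end path rates.

    Assumed (hypothetical) topology: hosts A, B, C joined by links a, b, c
    to one router, so path A-B crosses links a and b, and so on.
    Assumed model: independent link losses, so t_ab = t_a * t_b.
    """
    # Taking logs turns the products into a linear system:
    #   y_ab = x_a + x_b,  y_ac = x_a + x_c,  y_bc = x_b + x_c
    y_ab, y_ac, y_bc = (math.log(t) for t in (t_ab, t_ac, t_bc))

    # Closed-form solution of the 3x3 system.
    x_a = (y_ab + y_ac - y_bc) / 2
    x_b = (y_ab + y_bc - y_ac) / 2
    x_c = (y_ac + y_bc - y_ab) / 2

    # Map the log-rates back to delivery rates.
    return tuple(math.exp(x) for x in (x_a, x_b, x_c))


# Made-up link delivery rates, used only to generate path observations.
true_links = (0.99, 0.90, 0.95)
a, b, c = true_links
observed_paths = (a * b, a * c, b * c)

estimated_links = infer_star_links(*observed_paths)
```

With noise-free observations that satisfy the independence assumption, the estimates match the made-up link rates. If losses on different links were correlated, the same arithmetic would still return numbers, but nothing in the output would signal that they are wrong.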