D. Weitzel, Shawn McKee, B. Bockelman, J. Thiltges, M. Babik, I. Vukotic
Venue: 2021 IEEE Workshop on Innovating the Network for Data-Intensive Science (INDIS)
Published: 2021-11-01
DOI: 10.1109/indis54524.2021.00006 (https://doi.org/10.1109/indis54524.2021.00006)
Citation count: 0
The Service Analysis and Network Diagnosis Data Pipeline
Modern network performance monitoring toolkits, such as perfSONAR, take a remarkable number of measurements about the local network environment. To gain a complete picture of network performance, however, one needs to aggregate data across a large number of endpoints. The Service Analysis and Network Diagnosis (SAND) data pipeline collects data from diverse sources and ingests these measurements into a message bus. The message bus allows the project to send the data to multiple consumers, including a tape archive, an Elasticsearch database, and a peer infrastructure at CERN. In this paper, we explain the architecture and evolution of the SAND data pipeline, the scale of the resulting dataset, and how it supports a wide variety of network analysis applications.
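The fan-out pattern described in the abstract, where one ingested measurement is delivered to several independent consumers, can be illustrated with a minimal in-memory sketch. This is not the SAND implementation: the `FanoutBus` class, the topic name, the host names, and the three list-backed consumers standing in for the tape archive, the Elasticsearch database, and the CERN peer infrastructure are all hypothetical, chosen only to show the shape of the idea.

```python
from typing import Callable, Dict, List

Message = Dict[str, object]
Consumer = Callable[[Message], None]


class FanoutBus:
    """Toy in-memory stand-in for a message bus (hypothetical, illustration only)."""

    def __init__(self) -> None:
        # Maps a topic name to the consumers subscribed to it.
        self._consumers: Dict[str, List[Consumer]] = {}

    def subscribe(self, topic: str, consumer: Consumer) -> None:
        """Register a consumer callback for a topic."""
        self._consumers.setdefault(topic, []).append(consumer)

    def publish(self, topic: str, message: Message) -> int:
        """Deliver a copy of the message to every subscriber; return the delivery count."""
        delivered = 0
        for consumer in self._consumers.get(topic, []):
            # Each consumer gets its own copy so one cannot mutate another's view.
            consumer(dict(message))
            delivered += 1
        return delivered


# Hypothetical stand-ins for the three kinds of consumers named in the abstract.
archive: List[Message] = []        # stands in for the tape archive
search_index: List[Message] = []   # stands in for the Elasticsearch database
peer_forward: List[Message] = []   # stands in for the peer infrastructure at CERN

bus = FanoutBus()
topic = "perfsonar.latency"  # assumed topic name
bus.subscribe(topic, archive.append)
bus.subscribe(topic, search_index.append)
bus.subscribe(topic, peer_forward.append)

# A made-up measurement record; the field names are assumptions.
measurement: Message = {
    "source": "host-a.example.org",
    "destination": "host-b.example.org",
    "latency_ms": 12.5,
}
delivered = bus.publish(topic, measurement)
```

The point of the design is decoupling: the collector publishes once and does not need to know which consumers exist, so a new consumer can be added by subscribing it without touching the ingest side.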