A. Basu, Rishi Singh, Chenyang Yu, Amarjeet Prasad, Kunal Banerjee
15th Innovations in Software Engineering Conference, published 2022-02-24. DOI: 10.1145/3511430.3511446 (https://doi.org/10.1145/3511430.3511446)
Designing, Developing and Deploying an Enterprise Scale Network Monitoring System
Walmart carries out its retail business across 27 countries, both through brick-and-mortar locations (approximately 11,500 stores and clubs) and through e-commerce. To ensure a smooth customer experience across the globe, we need to monitor, at all times, the health of all devices spread across geographies, ranging from networking hardware and storage to compute servers. Specifically, we need to track which device is facing an issue, when the issue occurred, and what kind of alert it calls for. Swift remediation is carried out proactively, i.e., before a device fails, and sometimes reactively, i.e., after a device has failed. Tackling this challenge at enterprise scale requires various technologies working together seamlessly. In this work, we give insight into how network monitoring is handled at Walmart and elaborate on the design decisions taken.