{"title":"Automating network heuristic design and analysis","authors":"Anup Agarwal, V. Arun, Devdeep Ray, R. Martins, S. Seshan","doi":"10.1145/3563766.3564085","DOIUrl":"https://doi.org/10.1145/3563766.3564085","url":null,"abstract":"Heuristics are ubiquitous in computer systems. Examples include congestion control, adaptive bit rate streaming, scheduling, load balancing, and caching. In some domains, theoretical proofs have provided clarity on the conditions where a heuristic is guaranteed to work well. This has not been possible in all domains because proving such guarantees can involve combinatorial reasoning, which makes it hard, cumbersome, and error-prone. In this paper, we argue that computers should help humans with the combinatorial part of reasoning. We model reasoning questions as ∃∀ formulas [1] and solve them using the counterexample guided inductive synthesis (CEGIS) framework. As preliminary evidence, we prototype CCmatic, a tool that semi-automatically synthesizes congestion control algorithms that are provably robust. It rediscovered a recent congestion control algorithm that provably achieves high utilization and bounded delay under a challenging network model. It also found previously unknown variants of the algorithm that achieve different throughput-delay trade-offs.","PeriodicalId":339381,"journal":{"name":"Proceedings of the 21st ACM Workshop on Hot Topics in Networks","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128680813","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
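The abstract above mentions solving ∃∀ questions with a CEGIS loop. As a rough illustration only, the following Python sketch runs that loop over small finite domains. It is not CCmatic's implementation: the function name `cegis`, the brute-force search that stands in for a solver, and the arithmetic specification are all hypothetical choices made for this example.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def cegis(
    candidates: Sequence[int],
    inputs: Sequence[int],
    spec: Callable[[int, int], bool],
    max_rounds: int = 100,
) -> Tuple[Optional[int], List[int]]:
    """Look for a candidate c such that spec(c, x) holds for every x in inputs.

    Returns (candidate, counterexamples); candidate is None when no
    candidate is consistent with the counterexamples collected so far.
    """
    counterexamples: List[int] = []
    for _ in range(max_rounds):
        # Synthesis step: pick any candidate that satisfies the spec on
        # all counterexamples seen so far (the "exists" side).
        candidate = next(
            (c for c in candidates if all(spec(c, x) for x in counterexamples)),
            None,
        )
        if candidate is None:
            return None, counterexamples
        # Verification step: search for an input that breaks the
        # candidate (the "for all" side).
        cex = next((x for x in inputs if not spec(candidate, x)), None)
        if cex is None:
            return candidate, counterexamples
        counterexamples.append(cex)
    raise RuntimeError("round limit reached without a verdict")


def toy_spec(k: int, x: int) -> bool:
    # Hypothetical spec: the scaled value k*x must stay within [3x, 4x].
    return 3 * x <= k * x <= 4 * x


result, seen = cegis(range(10), range(21), toy_spec)
print(result, seen)
```

In a real tool both steps would typically be delegated to an SMT solver; the brute-force search here only keeps the sketch self-contained.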
{"title":"Full-stack SDN","authors":"Debnil Sur, Ben Pfaff, L. Ryzhyk, M. Budiu","doi":"10.1145/3563766.3564101","DOIUrl":"https://doi.org/10.1145/3563766.3564101","url":null,"abstract":"The conventional approach for building software-defined network systems requires separately developing the management, control, and data planes. Manually written code connects the management plane's configuration to the control plane, and the control plane generates the data planes' configurations as small program fragments that scatter across the codebase. Scalability and correctness become increasingly challenging as such a system develops and grows. In contrast, in our approach, called Nerpa, all three planes are programmed in a unified way. In Nerpa a transactional database stores management plane state. The control plane is implemented in a specialized query language which automatically executes in an incremental fashion, improving scalability. Finally, the data plane is programmed in P4. To aid correctness, all three parts are type-checked together, and tools generate code for data movement between planes. We have published a prototype implementation using an open-source license. We believe that full-stack SDN can build more robust and maintainable networked systems.","PeriodicalId":339381,"journal":{"name":"Proceedings of the 21st ACM Workshop on Hot Topics in Networks","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133879724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Load balancers need in-band feedback control","authors":"Bhavana Vannarth Shobhana, S. Narayana, B. Nath","doi":"10.1145/3563766.3564094","DOIUrl":"https://doi.org/10.1145/3563766.3564094","url":null,"abstract":"Server load balancers (LBs) are critical components of interactive services, routing client requests to servers in a pool. LBs improve service performance and increase availability by spreading the request load evenly across servers. It is time to rethink what LBs can do for applications. As application compute becomes increasingly granular (e.g., microservices), request-processing latencies at servers will be ever more impacted by software and system variability at small time scales (e.g., 100 μs to 1 ms). Beyond balancing load, we argue that LBs must actively optimize application response time by adapting request routing to quickly varying server performance. Specifically, we advocate for in-band feedback control: LBs should adapt the request-routing policy using purely local observations of server performance, derived from requests traversing the LB. A key challenge to designing such feedback controllers is that high-speed LBs only see the requests, not the responses. We present the design of an LB that adapts to a server latency inflation of 1 ms and reduces tail latencies in milliseconds, while observing only client-to-server traffic.","PeriodicalId":339381,"journal":{"name":"Proceedings of the 21st ACM Workshop on Hot Topics in Networks","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132773336","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Network can check itself: scaling data plane checking via distributed, on-device verification","authors":"Qiao Xiang, Ridi Wen, Che-Ling Huang, Yuxin Wang, Franck Le","doi":"10.1145/3563766.3564095","DOIUrl":"https://doi.org/10.1145/3563766.3564095","url":null,"abstract":"Current data plane verification (DPV) tools employ a centralized architecture, where a server collects the data planes of all devices and verifies them. This architecture is inherently unscalable (i.e., requiring a reliable management network, incurring a long control path and making the server a single point of failure). In this paper, we tackle this scalability challenge of DPV from an architectural perspective. In particular, we circumvent the scalability bottleneck of the centralized design and advocate for a distributed, on-device DPV framework. Our key insight is that DPV can be transformed into a counting problem on a DAG, which can be naturally decomposed into lightweight tasks executed at network devices, enabling scalability. Evaluation shows that a prototype of this framework achieves scalable DPV under various settings, with little overhead on commodity network devices.","PeriodicalId":339381,"journal":{"name":"Proceedings of the 21st ACM Workshop on Hot Topics in Networks","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130565076","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
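One possible reading of the "counting problem on a DAG" idea in the record above is that each device derives a small count from the counts of its next hops. The Python sketch below illustrates that reading for a single destination. It is an assumption-laden toy, not the paper's algorithm or message format: the function name `count_paths`, the next-hop dictionary, and the four-node topology are hypothetical, and the forwarding graph is assumed to be acyclic.

```python
from typing import Dict, List


def count_paths(next_hops: Dict[str, List[str]], dst: str) -> Dict[str, int]:
    """Count, for every node, the forwarding paths that reach dst.

    Each node's count is the sum of its next hops' counts, so a device
    could compute its own value from what its neighbours report.
    Assumes the forwarding graph for dst has no cycles.
    """
    counts: Dict[str, int] = {}

    def visit(node: str) -> int:
        if node not in counts:
            if node == dst:
                counts[node] = 1  # the destination trivially reaches itself
            else:
                counts[node] = sum(visit(n) for n in next_hops.get(node, []))
        return counts[node]

    for node in next_hops:
        visit(node)
    visit(dst)
    return counts


# Hypothetical topology: A forwards to B and C, both forward to D;
# E has no next hop towards D.
topology = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": [], "E": []}
path_counts = count_paths(topology, "D")
print(path_counts)
```

A count of zero (node `E` here) would flag a node whose packets never reach the destination, which is the kind of property a data plane checker looks for.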
{"title":"DIP","authors":"Ziqiang Wang, Zhuotao Liu, Xiaoliang Wang, Songtao Fu, Ke Xu","doi":"10.1093/gmo/9781561592630.article.j122000","DOIUrl":"https://doi.org/10.1093/gmo/9781561592630.article.j122000","url":null,"abstract":"","PeriodicalId":339381,"journal":{"name":"Proceedings of the 21st ACM Workshop on Hot Topics in Networks","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132433074","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The decoupling principle: a practical privacy framework","authors":"Paul Schmitt, J. Iyengar, Christopher A. Wood, B. Raghavan","doi":"10.1145/3563766.3564112","DOIUrl":"https://doi.org/10.1145/3563766.3564112","url":null,"abstract":"The three-decade struggle to ensure Internet data confidentiality, a key aspect of communications privacy, is finally behind us. Encryption is fast, secure, and standard in all browsers, modern transports, and major protocols. Yet it has long seemed that network privacy is not unified by core principles but a grab bag of techniques and ideas applied to an equally wide range of applications, contexts, layers of infrastructure, and software stacks. Here we attempt to distill a principle, one that is old but seldom discussed as such, for building privacy into Internet services. We explore what privacy properties are desirable and achievable when we apply this principle. We evaluate several classic systems and ones that have been recently deployed with this principle applied, and discuss future directions for network privacy building upon these efforts.","PeriodicalId":339381,"journal":{"name":"Proceedings of the 21st ACM Workshop on Hot Topics in Networks","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115060447","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Understanding host interconnect congestion","authors":"Saksham Agarwal, R. Agarwal, Behnam Montazeri, M. Moshref, Khaled Elmeleegy, L. Rizzo, M. Kruijf, G. Kumar, S. Ratnasamy, D. Culler, A. Vahdat","doi":"10.1145/3563766.3564110","DOIUrl":"https://doi.org/10.1145/3563766.3564110","url":null,"abstract":"We present evidence and a characterization of host congestion in production clusters: the adoption of high-bandwidth access links is leading to the emergence of bottlenecks within the host interconnect (the NIC-to-CPU data path). We demonstrate that contention on existing IO memory management units and/or the memory subsystem can significantly reduce the available NIC-to-CPU bandwidth, resulting in hundreds of microseconds of queueing delays and eventual packet drops at hosts (even when running a state-of-the-art congestion control protocol that accounts for CPU-induced host congestion). We also discuss implications of host interconnect congestion for the design of future host architectures, network stacks, and network protocols.","PeriodicalId":339381,"journal":{"name":"Proceedings of the 21st ACM Workshop on Hot Topics in Networks","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128836572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient flow scheduling in distributed deep learning training with echelon formation","authors":"Rui Pan, Yiming Lei, Jialong Li, Zhiqiang Xie, Binhang Yuan, Yiting Xia","doi":"10.1145/3563766.3564096","DOIUrl":"https://doi.org/10.1145/3563766.3564096","url":null,"abstract":"This paper discusses why flow scheduling does not apply to distributed deep learning training and presents EchelonFlow, the first network abstraction to bridge the gap. EchelonFlow deviates from the common belief that semantically related flows should finish at the same time. We reached the key observation, after extensive workflow analysis of diverse training paradigms, that distributed training jobs observe strict computation patterns, which may consume data at different times. We devise a generic method to model the drastically different computation patterns across training paradigms, and formulate EchelonFlow to regulate flow finish times accordingly. Case studies of mainstream training paradigms under EchelonFlow demonstrate the expressiveness of the abstraction, and our system sketch suggests the feasibility of an EchelonFlow scheduling system.","PeriodicalId":339381,"journal":{"name":"Proceedings of the 21st ACM Workshop on Hot Topics in Networks","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122706130","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Congestion control in machine learning clusters","authors":"S. Rajasekaran, M. Ghobadi, Gautam Kumar, Aditya Akella","doi":"10.1145/3563766.3564115","DOIUrl":"https://doi.org/10.1145/3563766.3564115","url":null,"abstract":"This paper argues that fair-sharing, the holy grail of congestion control algorithms for decades, is not necessarily a desirable property in Machine Learning (ML) training clusters. We demonstrate that for a specific combination of jobs, introducing unfairness improves the training time for all competing jobs. We call this specific combination of jobs compatible and define the compatibility criterion using a novel geometric abstraction. Our abstraction rolls time around a circle and rotates the communication phases of jobs to identify fully compatible jobs. Using this abstraction, we demonstrate up to 1.3× improvement in the average training iteration time of popular ML models. We advocate that resource management algorithms should take job compatibility on network links into account. We then propose three directions to ameliorate the impact of network congestion in ML training clusters: (i) an adaptively unfair congestion control scheme, (ii) priority queues on switches, and (iii) precise flow scheduling.","PeriodicalId":339381,"journal":{"name":"Proceedings of the 21st ACM Workshop on Hot Topics in Networks","volume":"IA-15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126557557","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
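The geometric abstraction described in the record above (rolling time around a circle and rotating communication phases) can be pictured with a heavily simplified sketch. The Python below assumes two jobs with the same iteration length, divided into equal discrete time slots, where `True` marks a slot in which the job communicates. These assumptions, the function name `find_compatible_rotation`, and the slot patterns are illustrative only and are not taken from the paper.

```python
from typing import List, Optional


def find_compatible_rotation(job_a: List[bool], job_b: List[bool]) -> Optional[int]:
    """Return a rotation of job_b (in slots) that avoids any overlap with
    job_a's communication slots, or None if every rotation overlaps.

    Both patterns describe one training iteration wrapped around a circle,
    so slot indices are taken modulo the iteration length.
    """
    if len(job_a) != len(job_b):
        raise ValueError("this sketch assumes equal iteration lengths")
    n = len(job_a)
    for shift in range(n):
        # Delay job_b by `shift` slots and check each slot for a collision.
        overlap = any(job_a[t] and job_b[(t - shift) % n] for t in range(n))
        if not overlap:
            return shift
    return None


# Hypothetical patterns over a six-slot iteration.
job_a = [True, True, False, False, False, False]
job_b = [True, True, True, False, False, False]
shift = find_compatible_rotation(job_a, job_b)
print(shift)
```

Under these assumptions, a non-`None` result means the two jobs could interleave their communication phases on a shared link without colliding.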
{"title":"Bringing wifi localization to any wifi devices","authors":"Tianxiang Li, Haofan Lu, Reza Rezvani, A. Abedi, Omid Salehi-Abari","doi":"10.1145/3563766.3564090","DOIUrl":"https://doi.org/10.1145/3563766.3564090","url":null,"abstract":"Recent years have seen significant advances in WiFi localization. However, existing systems require either multiple access points to cooperate with each other or a single access point to have multiple antennas and transceiver chains. Therefore, they cannot be integrated into most IoT WiFi chipsets, which have only a single transceiver chain. This paper presents WiSight, a novel approach to bringing WiFi localization to any WiFi device, especially those with a single RF chain. We propose a WiFi antenna design and use the inherent properties of the 802.11 protocol to measure Angle-of-Arrival (AoA) and Time-of-Flight (ToF) using a single transceiver chain. Our proof-of-concept simulation and real-world experiments indicate the feasibility of this approach.","PeriodicalId":339381,"journal":{"name":"Proceedings of the 21st ACM Workshop on Hot Topics in Networks","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128959621","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}