Pitfalls of data-driven networking: A case study of latent causal confounders in video streaming
P. C. Sruthi, Sanjay G. Rao, Bruno Ribeiro
Proceedings of the Workshop on Network Meets AI & ML, 2020-08-14
DOI: 10.1145/3405671.3405815 (https://doi.org/10.1145/3405671.3405815)
Citations: 10
Abstract
This paper motivates the need to support counterfactual reasoning (i.e., answering "what-if" questions about events that did not occur) when collecting network data. We focus on video streaming: for example, given logs of a video session, a video publisher may ask whether a user would continue to experience no rebuffering events if the lowest-quality video choice were eliminated. We discuss potential pitfalls related to counterfactual reasoning, and argue that dynamic network state (e.g., bandwidth) serves as a confounding yet hidden (latent) feature that complicates such analyses. We illustrate the challenges and present preliminary methods to address them using concrete examples. Our evaluations show that existing approaches, including randomized trials (collecting data from an algorithm that selects bitrates randomly), are by themselves inadequate for counterfactual reasoning related to video streaming, and must be supplemented by techniques that explicitly infer latent features.
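The confounding effect described above can be sketched with a toy simulation. This is not the paper's method or data: the two-rung bitrate ladder, the two bandwidth levels, and the names `play` and `compare` are all assumptions made for illustration. The sketch only shows why logs that omit bandwidth can give a misleading answer to the "remove the lowest quality" question.

```python
import random

# Hypothetical two-rung bitrate ladder (Mbps); values are illustrative only.
LOW, HIGH = 1.0, 3.0


def play(bandwidth, ladder):
    """Toy bitrate selection: pick the highest bitrate the bandwidth sustains,
    falling back to the lowest rung. Returns (chosen_bitrate, rebuffered)."""
    sustainable = [b for b in ladder if b <= bandwidth]
    chosen = max(sustainable) if sustainable else min(ladder)
    return chosen, chosen > bandwidth


def compare(n=10_000, seed=0):
    """Return (naive_estimate, simulated_truth) of the rebuffering rate
    if the lowest bitrate were removed."""
    rng = random.Random(seed)
    # Latent state: per-session bandwidth (assumed values), never logged.
    bandwidths = [rng.choice([2.0, 5.0]) for _ in range(n)]

    # Logged data contain only the chosen bitrate and the outcome.
    logs = [play(bw, [LOW, HIGH]) for bw in bandwidths]

    # Naive estimate: reuse the rebuffering rate of logged HIGH sessions,
    # ignoring that HIGH was only ever chosen when bandwidth was ample.
    high_outcomes = [rebuffered for chosen, rebuffered in logs if chosen == HIGH]
    naive = sum(high_outcomes) / len(high_outcomes)

    # Simulated counterfactual: replay the same bandwidths without LOW.
    truth = sum(play(bw, [HIGH])[1] for bw in bandwidths) / n
    return naive, truth


if __name__ == "__main__":
    naive, truth = compare()
    print(f"naive estimate from logs: {naive:.2f}")
    print(f"simulated counterfactual: {truth:.2f}")
```

In this toy setup the logs show no rebuffering at all, so the naive estimate is zero, while roughly half of the simulated sessions (those with the lower assumed bandwidth) would rebuffer once the lowest rung is gone. The gap comes entirely from the unlogged bandwidth, which drives both the bitrate choice and the outcome.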