{"title":"AutoTomo: Learning-Based Traffic Estimator Incorporating Network Tomography","authors":"Yan Qiao;Kui Wu;Xinyu Yuan","doi":"10.1109/TNET.2024.3424446","DOIUrl":"10.1109/TNET.2024.3424446","url":null,"abstract":"Estimating the Traffic Matrix (TM) is a critical yet resource-intensive process in network management. With the advent of deep learning models, we now have the potential to learn the inverse mapping from link loads to origin-destination (OD) flows more efficiently and accurately. However, a significant hurdle is that all current learning-based techniques require a training dataset covering a comprehensive TM for a specific duration. This requirement is often infeasible in practical scenarios. This paper addresses this challenging learning problem, specifically when dealing with incomplete and biased TM data. Our initial approach involves parameterizing the unidentified flows, thereby transforming this target-deficient learning problem into an empirical optimization problem that integrates tomography constraints. Following this, we introduce AutoTomo, a learning-based architecture designed to optimize both the inverse mapping and the unexplored flows during the model’s training phase. We also propose an innovative observation selection algorithm, which aids network operators in gathering the most insightful measurements with limited device resources. We evaluate AutoTomo on three public traffic datasets: Abilene, GÉANT and Cernet. The results reveal that AutoTomo outperforms five state-of-the-art learning-based TM estimation techniques. With complete training data, AutoTomo improves the accuracy of the best-performing method by 15%, while it shows an improvement between 30% and 56% with incomplete training data. Furthermore, AutoTomo exhibits rapid testing speed, making it a viable tool for real-time TM estimation.","PeriodicalId":13443,"journal":{"name":"IEEE/ACM Transactions on Networking","volume":"32 6","pages":"4644-4659"},"PeriodicalIF":3.0,"publicationDate":"2024-07-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141574637","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Inter-Temporal Reward Strategies in the Presence of Strategic Ethical Hackers","authors":"Jing Hou;Xuyu Wang;Amy Z. Zeng","doi":"10.1109/TNET.2024.3422922","DOIUrl":"10.1109/TNET.2024.3422922","url":null,"abstract":"A skyrocketing increase in cyber-attacks significantly elevates the importance of secure software development. Companies launch various bug-bounty programs to reward ethical hackers for identifying potential vulnerabilities in their systems before malicious hackers can exploit them. One of the most difficult decisions in bug-bounty programs is how to appropriately reward ethical hackers. This paper develops a model of an inter-temporal reward strategy with endogenous e-hacker behaviors. We formulate a novel game model to characterize the interactions between a software vendor and multiple heterogeneous ethical hackers. The optimal levels of rewards are discussed under different reward strategies. The impacts of ethical hackers’ strategic bug-hoarding and their competitive and collaborative behaviors on the performance of the program are also evaluated. We demonstrate the effectiveness of the inter-temporal reward mechanism in attracting ethical hackers and encouraging early bug reports. Our results indicate that ignoring the ethical hackers’ strategic behaviors could result in setting inappropriate rewards, which may inadvertently encourage them to hoard bugs for higher rewards. In addition, a more skilled e-hacker is more likely to delay their reporting and less motivated to work collaboratively with other e-hackers. Moreover, the vendor gains more from e-hacker collaboration when such collaboration can significantly increase the speed or probability of uncovering difficult-to-detect vulnerabilities.","PeriodicalId":13443,"journal":{"name":"IEEE/ACM Transactions on Networking","volume":"32 5","pages":"4427-4440"},"PeriodicalIF":3.0,"publicationDate":"2024-07-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141574635","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Toward Full-Coverage and Low-Overhead Profiling of Network-Stack Latency","authors":"Xiang Chen;Hongyan Liu;Wenbin Zhang;Qun Huang;Dong Zhang;Haifeng Zhou;Xuan Liu;Chunming Wu","doi":"10.1109/TNET.2024.3421327","DOIUrl":"10.1109/TNET.2024.3421327","url":null,"abstract":"In modern data center networks (DCNs), network-stack processing accounts for a large portion of the end-to-end latency of TCP flows. Profiling network-stack latency anomalies has therefore been considered a crucial part of DCN performance diagnosis and troubleshooting. In particular, such profiling requires full coverage (i.e., profiling every TCP packet) and low overhead (i.e., profiling should avoid high CPU consumption in end-hosts). However, existing solutions rely on system calls or tracepoints in end-hosts to implement network-stack latency profiling, leading to either low coverage or high overhead. We propose Torp, a framework that offers full-coverage and low-overhead profiling of network-stack latency. Our key idea is to offload as much of the profiling as possible from costly system calls or tracepoints to the Torp agent built on eBPF modules, and further to include a Torp handler on the ToR switch to accelerate the remaining profiling operations. Torp efficiently coordinates the ToR switch and the Torp agent on end-hosts to jointly execute the entire latency profiling task. We have implemented Torp on <inline-formula> <tex-math>$32\\times 100$ </tex-math></inline-formula> Gbps Tofino switches. Testbed experiments indicate that Torp achieves full coverage and orders of magnitude lower host-side overhead compared to other solutions.","PeriodicalId":13443,"journal":{"name":"IEEE/ACM Transactions on Networking","volume":"32 5","pages":"4441-4455"},"PeriodicalIF":3.0,"publicationDate":"2024-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141549420","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Minimizing Edge Caching Service Costs Through Regret-Optimal Online Learning","authors":"Guocong Quan;Atilla Eryilmaz;Ness B. Shroff","doi":"10.1109/TNET.2024.3420758","DOIUrl":"10.1109/TNET.2024.3420758","url":null,"abstract":"Edge caching has been widely implemented to efficiently serve data requests from end users. Numerous edge caching policies have been proposed to adaptively update the cache contents based on various statistics. One critical statistic is the miss cost, which can measure the latency or the bandwidth/energy consumption required to resolve a cache miss. Existing caching policies typically assume that the miss cost for each data item is fixed and known. However, in real systems, miss costs can be random with unknown statistics. A promising approach would be to use online learning to estimate the unknown statistics of these random costs, and make caching decisions adaptively. Unfortunately, conventional learning techniques cannot be directly applied, because the caching problem has additional cache capacity and cache update constraints that are not covered in traditional learning settings. In this work, we resolve these issues by developing a novel edge caching policy that learns uncertain miss costs efficiently, and is shown to be asymptotically optimal. We first derive an asymptotic lower bound on the achievable regret. We then design a Kullback-Leibler lower confidence bound (KL-LCB) based edge caching policy, which adaptively learns the random miss costs by following the “optimism in the face of uncertainty” principle. By employing a novel analysis that accounts for the new constraints and the dynamics of the setting, we prove that the regret of the proposed policy matches the regret lower bound, thus showing asymptotic optimality. Further, via numerical experiments we demonstrate the performance improvements of our policy over natural benchmarks.","PeriodicalId":13443,"journal":{"name":"IEEE/ACM Transactions on Networking","volume":"32 5","pages":"4349-4364"},"PeriodicalIF":3.0,"publicationDate":"2024-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141549421","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Circling Reduction Algorithm for Cloud Edge Traffic Allocation Under the 95th Percentile Billing","authors":"Pengxiang Zhao;Jintao You;Xiaoming Yuan","doi":"10.1109/TNET.2024.3415649","DOIUrl":"10.1109/TNET.2024.3415649","url":null,"abstract":"In cloud ecosystems, managing bandwidth costs is pivotal for both operational efficiency and service quality. This paper tackles the cloud-edge traffic allocation problem, particularly optimizing for the 95th percentile billing scheme, which is widely employed across various cloud computing scenarios by Internet Service Providers but has yet to be efficiently addressed. We introduce a mathematical model for this problem, establish its NP-hardness, and reformulate it as a mixed-integer program (MIP). The intricacy of the problem is further magnified by the scale of the cloud ecosystem, involving numerous data centers, client groups, and long billing cycles. Based on a structural analysis of our MIP model, we propose a two-stage solution strategy that retains optimality. We introduce the Circling Reduction Algorithm (CRA), a polynomial-time algorithm based on a rigorously derived lower bound for the objective value, to efficiently determine the binary variables in the first stage, while the remaining linear programming problem in the second stage can be easily resolved. Using the CRA, we develop algorithms for both offline and online traffic allocation scenarios and validate them on real-world datasets from the cloud provider under study. In offline scenarios, our method delivers up to 66.34% cost savings compared to a commercial solver, while also significantly improving computational speed. Additionally, it achieves an average of 14% cost reduction over the current solution of the studied cloud provider. For online scenarios, we achieve an average cost saving of 8.64% while staying within a 9% gap of the theoretical optimum.","PeriodicalId":13443,"journal":{"name":"IEEE/ACM Transactions on Networking","volume":"32 5","pages":"4254-4269"},"PeriodicalIF":3.0,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141506211","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Toward Improved Energy Fairness in CSMA-Based LoRaWAN","authors":"Chenglong Shao;Osamu Muta;Kazuya Tsukamoto;Wonjun Lee;Xianpeng Wang;Malvin Nkomo;Kapil R. Dandekar","doi":"10.1109/TNET.2024.3418913","DOIUrl":"10.1109/TNET.2024.3418913","url":null,"abstract":"This paper proposes a heterogeneous carrier-sense multiple access (CSMA) protocol named LoHEC as the first research attempt to improve energy fairness when applying CSMA to long-range wide area network (LoRaWAN). LoHEC is enabled by Channel Activity Detection (CAD), a recently introduced carrier-sensing technique that detects LoRaWAN signals even below the noise floor. The design of LoHEC is inspired by the fact that existing CAD-based CSMA proposals operate in a homogeneous manner. In other words, they require LoRaWAN end devices to perform identical CAD regardless of differences in their network parameter, the spreading factor (SF). This causes an energy consumption imbalance among end devices, since the energy consumed during CAD is significantly affected by the SF. By considering the heterogeneity of LoRaWAN in terms of SF, LoHEC requires end devices to perform different numbers of CAD operations with different CAD intervals during channel access. In particular, the number of needed CADs and the CAD interval are determined based on the CAD energy consumption under different SFs. We conduct extensive experiments on LoHEC with a practical LoRaWAN testbed comprising 60 commercial off-the-shelf end devices. Experimental results show that in comparison with existing solutions, LoHEC can achieve up to <inline-formula> <tex-math>$0.85\\times $ </tex-math></inline-formula> improvement in energy fairness on average.","PeriodicalId":13443,"journal":{"name":"IEEE/ACM Transactions on Networking","volume":"32 5","pages":"4382-4397"},"PeriodicalIF":3.0,"publicationDate":"2024-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141506212","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Personalized Pricing Through Strategic User Profiling in Social Networks","authors":"Qinqi Lin;Lingjie Duan;Jianwei Huang","doi":"10.1109/TNET.2024.3410976","DOIUrl":"10.1109/TNET.2024.3410976","url":null,"abstract":"Traditional user profiling techniques rely on browsing history or purchase records to identify users’ willingness to pay. This enables sellers to offer personalized prices to profiled users while charging only a uniform price to non-profiled users. However, the emergence of privacy-enhancing technologies has caused users to actively avoid on-site data tracking. Today, major online sellers have turned to public platforms such as online social networks to better track users’ profiles from their product-related discussions. This paper presents the first analytical study on how users should best manage their social activities against potential personalized pricing, and how a seller should strategically adjust her pricing scheme to facilitate user profiling in social networks. We formulate a dynamic Bayesian game played between the seller and users under asymmetric information. The key challenge of analyzing this game comes from the double couplings between the seller and the users as well as among the users. Furthermore, the equilibrium analysis needs to ensure consistency between users’ revealed information and the seller’s belief under random user profiling. We address these challenges by alternately applying backward and forward induction, and successfully characterize the unique perfect Bayesian equilibrium (PBE) in closed form. Our analysis reveals that as the accuracy of profiling technology improves, the seller tends to raise the equilibrium uniform price to motivate users’ increased social activities and facilitate user profiling. However, this results in most users being worse off after the informed consent policy is imposed to ensure users’ awareness of data access and profiling practices by potential sellers. This finding suggests that the recent regulatory evolution toward enhancing users’ privacy awareness may have the unintended consequence of reducing users’ payoffs. Finally, we examine prevalent pricing practices where the seller breaks a pricing promise to personalize final offerings, and show that this only slightly improves the seller’s average revenue while introducing higher variance.","PeriodicalId":13443,"journal":{"name":"IEEE/ACM Transactions on Networking","volume":"32 5","pages":"3977-3992"},"PeriodicalIF":3.0,"publicationDate":"2024-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141506213","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Whittle Index-Based Q-Learning for Wireless Edge Caching With Linear Function Approximation","authors":"Guojun Xiong;Shufan Wang;Jian Li;Rahul Singh","doi":"10.1109/TNET.2024.3417351","DOIUrl":"10.1109/TNET.2024.3417351","url":null,"abstract":"We consider the problem of content caching at the wireless edge to serve a set of end users via unreliable wireless channels so as to minimize the average latency experienced by end users due to the constrained wireless edge cache capacity. We formulate this problem as a Markov decision process, or more specifically a restless multi-armed bandit problem, which is provably hard to solve. We begin by investigating a discounted counterpart, and prove that it admits an optimal policy of the threshold type. We then show that this result also holds for the average latency problem. Using this structural result, we establish the indexability of our problem, and employ the Whittle index policy to minimize average latency. Since system parameters such as content request rates and wireless channel conditions are often unknown and time-varying, we further develop a model-free reinforcement learning algorithm dubbed <monospace>Q+-Whittle</monospace> that relies on the Whittle index policy. However, <monospace>Q+-Whittle</monospace> requires storing the Q-function values for all state-action pairs, the number of which can be extremely large for wireless edge caching. To this end, we approximate the Q-function by a parameterized function class with a much smaller dimension, and further design a <monospace>Q+-Whittle</monospace> algorithm with linear function approximation, which is called <monospace>Q+-Whittle-LFA</monospace>. We provide a finite-time bound on the mean-square error of <monospace>Q+-Whittle-LFA</monospace>. Simulation results using real traces demonstrate that <monospace>Q+-Whittle-LFA</monospace> yields excellent empirical performance.","PeriodicalId":13443,"journal":{"name":"IEEE/ACM Transactions on Networking","volume":"32 5","pages":"4286-4301"},"PeriodicalIF":3.0,"publicationDate":"2024-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141529295","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PARING: Joint Task Placement and Routing for Distributed Training With In-Network Aggregation","authors":"Yuhang Qiu;Gongming Zhao;Hongli Xu;He Huang;Chunming Qiao","doi":"10.1109/TNET.2024.3414853","DOIUrl":"10.1109/TNET.2024.3414853","url":null,"abstract":"With the increase in both the model size and dataset size of distributed training (DT) tasks, communication between the workers and parameter servers (PSs) in a cluster has become a bottleneck. In-network aggregation (INA) enabled by programmable switches has been proposed as a promising solution to alleviate the communication bottleneck. However, existing works focus on in-network aggregation implementations based on simple DT placement and fixed routing policies, which may lead to a large communication overhead and inefficient use of resources (e.g., storage, computing power and bandwidth). In this paper, we propose PARING, the first-of-its-kind INA approach that jointly optimizes DT task placement and routing in order to reduce traffic volume and minimize communication time. We formulate the problem as a nonlinear multi-objective mixed-integer programming problem, and prove its NP-hardness. Based on the concept of Steiner trees, an algorithm with bounded approximation factors is proposed for this problem. Large-scale simulations show that our algorithm can reduce communication time by up to 81.0% and traffic volume by up to 19.1% compared to the state-of-the-art algorithms.","PeriodicalId":13443,"journal":{"name":"IEEE/ACM Transactions on Networking","volume":"32 5","pages":"4317-4332"},"PeriodicalIF":3.0,"publicationDate":"2024-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141523081","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Toward Resource-Efficient and High-Performance Program Deployment in Programmable Networks","authors":"Hongyan Liu;Xiang Chen;Qun Huang;Guoqiang Sun;Peiqiao Wang;Dong Zhang;Chunming Wu;Xuan Liu;Qiang Yang","doi":"10.1109/TNET.2024.3413388","DOIUrl":"10.1109/TNET.2024.3413388","url":null,"abstract":"Programmable switches allow administrators to customize packet processing behaviors in data plane programs. However, existing solutions for program deployment fail to achieve resource efficiency and high packet processing performance. In this paper, we propose SPEED, a system that provides resource-efficient and high-performance deployment for data plane programs. For resource efficiency, SPEED merges input data plane programs by reducing program redundancy. It then abstracts the substrate network into a one big switch (OBS) model, and deploys the merged program on the OBS while minimizing resource usage. For high performance, SPEED searches for the performance-optimal mapping between the OBS and the substrate network with respect to network-wide constraints. It also preserves program logic across different switches via inter-device packet scheduling. We have implemented SPEED on a Barefoot Tofino switch. The evaluation indicates that SPEED achieves resource-efficient and high-performance deployment for real data plane programs.","PeriodicalId":13443,"journal":{"name":"IEEE/ACM Transactions on Networking","volume":"32 5","pages":"4270-4285"},"PeriodicalIF":3.0,"publicationDate":"2024-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141523079","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}