IEEE Transactions on Parallel and Distributed Systems: Latest Articles

Beelog: Online Log Compaction for Dependable Systems
IF 5.6 Zone 2 Computer Science
IEEE Transactions on Parallel and Distributed Systems Pub Date : 2025-02-13 DOI: 10.1109/TPDS.2025.3541628
Luiz Gustavo C. Xavier;Cristina Meinhardt;Odorico Machado Mendizabal
Vol. 36, No. 4, pp. 689-700
Abstract: Logs are a known abstraction used to develop dependable and secure distributed systems. By logging entries on a sequential global log, systems can synchronize updates over replicas and provide a consistent state recovery in the presence of faults. However, their usage incurs a non-negligible overhead on the application's performance. This article presents Beelog, an approach to reduce logging impact and accelerate recovery on log-based protocols by safely discarding entries from logs. The technique involves executing a log compaction during run-time concurrently with the persistence and execution of commands. Besides compacting logging information, the proposed technique splits the log file and incorporates strategies to reduce logging overhead, such as batching and parallel I/O. We evaluate the proposed approach by implementing it as a new feature of the etcd key-value store and comparing it against etcd's standard logging. Utilizing workloads from the YCSB benchmark and experimenting with different configurations for batch size and number of storage devices, our results indicate that Beelog can reduce application recovery time, especially in write-intensive workloads with a small number of keys and a probability favoring the most recent keys to be updated. In such scenarios, we observed up to a 50% compaction in the log file size and a 65% improvement in recovery time compared to etcd's standard recovery protocol. As a side effect, batching results in higher command execution latency, ranging from 100 ms to 350 ms with Beelog, compared to etcd's default of 90 ms. Except for the latency increase, the proposed technique does not impose other significant performance costs, making it a practical solution for systems where fast recovery and reduced storage are priorities.
Citations: 0
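The idea of safely discarding log entries can be illustrated with a minimal sketch. It assumes a toy key-value log in which a later write to a key fully supersedes earlier ones; the names `Entry`, `compact`, and `replay` are hypothetical and are not taken from Beelog or etcd, and the sketch ignores batching, log splitting, and parallel I/O.

```python
from typing import Dict, Iterable, List, NamedTuple, Optional


class Entry(NamedTuple):
    index: int             # position in the original sequential log
    key: str
    value: Optional[str]   # None marks a delete in this toy model


def compact(log: Iterable[Entry]) -> List[Entry]:
    """Keep only the last entry written for each key, in log order."""
    latest: Dict[str, Entry] = {}
    for entry in log:
        latest[entry.key] = entry  # a later write supersedes earlier ones
    return sorted(latest.values(), key=lambda e: e.index)


def replay(log: Iterable[Entry]) -> Dict[str, str]:
    """Rebuild the key-value state by applying entries in order."""
    state: Dict[str, str] = {}
    for entry in log:
        if entry.value is None:
            state.pop(entry.key, None)
        else:
            state[entry.key] = entry.value
    return state
```

Under these assumptions, replaying the compacted log yields the same state as replaying the full log, which is why workloads that repeatedly update a small set of keys leave more entries to discard.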
Energy Efficient and Multi-Resource Optimization for Virtual Machine Placement by Improving MOEA/D
IF 5.6 Zone 2 Computer Science
IEEE Transactions on Parallel and Distributed Systems Pub Date : 2025-02-11 DOI: 10.1109/TPDS.2025.3538525
Wenting Wei;Huaxi Gu;Zhe Xiao;Yi Chen
Vol. 36, No. 6, pp. 1087-1099
Abstract: The explosive growth of cloud services has led to the widespread construction of large-scale data centers to meet diverse and multifaceted cloud computing demands. However, this expansion has resulted in substantial energy consumption. Virtual machine placement (VMP) has been extensively studied as a means to provide flexible and scalable cloud services while optimizing energy efficiency. Yet, the increasing complexity and diversity of applications have left VMP suffering from wasted resources and bottlenecks caused by unbalanced utilization of multi-dimensional resources. To address these issues, this article proposes a bi-objective optimization model for VMP that jointly optimizes power consumption and multi-dimensional resource utilization. Solving this large-scale bi-objective model presents a significant challenge in balancing performance and computational complexity. To tackle this, an enhanced decomposition-based multi-objective evolutionary algorithm (MOEA/D) based on $\varepsilon$-domination, termed $\varepsilon$-IMOEA/D-M2M, is designed to provide solutions for the proposed optimization. Performance evaluations demonstrate that the proposed VMP algorithm effectively reduces power consumption and balances multi-dimensional resource utilization while significantly decreasing running time compared to both heuristic and traditional evolutionary algorithms.
Citations: 0
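The ε-domination relation named in the abstract can be illustrated with a small sketch. The abstract does not state the paper's exact definition, so the additive form below (for minimization) is an assumption, as are the function names and the two objectives used in the usage note.

```python
from typing import Sequence


def pareto_dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def eps_dominates(a: Sequence[float], b: Sequence[float], eps: float) -> bool:
    """Additive epsilon-dominance (assumed form): a, relaxed by eps, is no worse than b."""
    return all(x - eps <= y for x, y in zip(a, b))
```

For two hypothetical placements scored as (power, resource imbalance), `(10.0, 0.5)` does not Pareto-dominate `(9.95, 0.6)`, but it does ε-dominate it for `eps = 0.1`. Relaxing the comparison in this way is commonly used to thin out near-duplicate solutions in an archive.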
Task-Aware Service Placement for Distributed Learning in Wireless Edge Networks
IF 5.6 Zone 2 Computer Science
IEEE Transactions on Parallel and Distributed Systems Pub Date : 2025-02-10 DOI: 10.1109/TPDS.2025.3539620
Rong Cong;Zhiwei Zhao;Mengfan Wang;Geyong Min;Jiangshu Liu;Jiwei Mo
Vol. 36, No. 4, pp. 731-744
Abstract: Machine learning has been a driving force in the evolution of tremendous computing services and applications in the past decade. Traditional learning systems rely on centralized training and inference, which poses serious privacy and security concerns. To solve this problem, distributed learning over wireless edge networks (DLWENs) has emerged as a trending solution and has attracted increasing research interest. In DLWENs, the corresponding services need to be placed onto the edge servers to process the distributed tasks. Different placements of training services can significantly affect the performance of all distributed learning tasks. In this article, we propose TASP, a task-aware service placement scheme for distributed learning in wireless edge networks. By carefully considering the structures (directed acyclic graphs) of the distributed learning tasks, the fine-grained task requests and inter-task dependencies are incorporated into the placement strategies to realize the parallel computation of learning services. We also exploit queuing theory to characterize the dynamics caused by task uncertainties. Extensive experiments based on the Alibaba ML dataset show that, compared to the state-of-the-art schemes, the proposed work reduces the overall delay of distributed learning tasks by 38.6% on average.
Citations: 0
Flips: A Flexible Partitioning Strategy Near Memory Processing Architecture for Recommendation System
IF 5.6 Zone 2 Computer Science
IEEE Transactions on Parallel and Distributed Systems Pub Date : 2025-02-06 DOI: 10.1109/TPDS.2025.3539534
Yudi Qiu;Lingfei Lu;Shiyan Yi;Minge Jing;Xiaoyang Zeng;Yang Kong;Yibo Fan
Vol. 36, No. 4, pp. 745-758
Abstract: Personalized recommendation systems are massively deployed in production data centers. The memory-intensive embedding layers of recommendation systems are the crucial performance bottleneck, with operations manifesting as sparse memory lookups and simple reduction computations. Recent studies propose near-memory processing (NMP) architectures to speed up embedding operations by utilizing high internal memory bandwidth. However, these solutions typically employ a fixed vector partitioning strategy that fails to adapt to changes in data center deployment scenarios and lacks practicality. We propose Flips, a flexible partitioning strategy NMP architecture that accelerates embedding layers. Flips supports more than ten partitioning strategies through hardware-software co-design. Novel hardware architectures and address mapping schemes are designed for the memory side and the host side. We provide two approaches to determine the optimal partitioning strategy for each embedding table, enabling the architecture to accommodate changes in deployment scenarios. Importantly, Flips is decoupled from the NMP level and can utilize rank-level, bank-group-level and bank-level parallelism. In peer-level NMP evaluations, Flips outperforms state-of-the-art NMP solutions, RecNMP, TRiM, and ReCross by up to 4.0×, 4.1×, and 3.5×, respectively.
Citations: 0
EfficientMoE: Optimizing Mixture-of-Experts Model Training With Adaptive Load Balance
IF 5.6 Zone 2 Computer Science
IEEE Transactions on Parallel and Distributed Systems Pub Date : 2025-02-06 DOI: 10.1109/TPDS.2025.3539297
Yan Zeng;Chengchuang Huang;Yipeng Mei;Lifu Zhang;Teng Su;Wei Ye;Wenqi Shi;Shengnan Wang
Vol. 36, No. 4, pp. 677-688
Abstract: Mixture-of-Experts (MoE) efficiently trains large models by using sparse activation to lower costs, selecting a few experts based on data characteristics. However, it faces challenges such as All-to-All communication overhead and load imbalance, with most optimizations targeting dynamic graphs rather than the more efficient static graphs. This study identifies two key challenges in training MoE on static graphs: 1) excessive All-to-All communication (up to 75% of iteration time) and load imbalance (70% of tokens handled by two experts) between experts due to the sparse structure of the MoE model and the token distribution; and 2) inefficient zero-padding for static shapes, leading to unnecessary computational overhead (wasting approximately 50% of resources). Thus, EfficientMoE, a scheduling method based on expert load and data characteristics, is introduced. EfficientMoE first designs a sampler to collect real-time information about token distribution, expert load, etc. It constructs a load prediction model to evaluate expert load. Subsequently, EfficientMoE proposes a dynamic scheduling strategy for experts based on the evaluated expert load, reducing All-to-All communication and addressing load-balancing issues. Additionally, an expert capacity model is proposed to set different capacities for replicas of hot experts before static graph compilation, minimizing computation and storage overhead caused by significant padding. This study implements EfficientMoE in MindSpore and uses 32 Ascend AI accelerators to train an MoE model with 21 billion parameters and evaluate its validity. EfficientMoE demonstrated an improvement of 30% in model training time, an approximately 12% reduction in communication time, and savings of 35% in computational resources across different clusters, compared with Switch Transformers and the FasterMoE method for static graphs.
Citations: 0
Parallel Multi Objective Shortest Path Update Algorithm in Large Dynamic Networks
IF 5.6 Zone 2 Computer Science
IEEE Transactions on Parallel and Distributed Systems Pub Date : 2025-01-30 DOI: 10.1109/TPDS.2025.3536357
S. M. Shovan;Arindam Khanda;Sajal K. Das
Vol. 36, No. 5, pp. 932-944
Abstract: The multi objective shortest path (MOSP) problem, crucial in various practical domains, seeks paths that optimize multiple objectives. Due to its high computational complexity, numerous parallel heuristics have been developed for static networks. However, real-world networks are often dynamic, with the network topology changing over time. Efficiently updating the shortest path in such networks is challenging, and existing algorithms for static graphs are inadequate for these dynamic conditions, necessitating novel approaches. Here, we first develop a parallel algorithm to efficiently update a single objective shortest path (SOSP) in fully dynamic networks, capable of accommodating both edge insertions and deletions. Building on this, we propose DynaMOSP, a parallel heuristic for Dynamic Multi Objective Shortest Path searches in large, fully dynamic networks. We provide a theoretical analysis of the conditions to achieve Pareto optimality. Furthermore, we devise a dedicated shared memory CPU implementation along with a version for heterogeneous computing environments. Empirical analysis on eight real-world graphs demonstrates that our method scales effectively. The shared memory CPU implementation achieves an average speedup of 12.74× and a maximum of 57.22×, while on an Nvidia GPU, it attains an average speedup of 69.19×, reaching up to 105.39× when compared to state-of-the-art techniques.
Citations: 0
Loci: Federated Continual Learning of Heterogeneous Tasks at Edge
IF 5.6 Zone 2 Computer Science
IEEE Transactions on Parallel and Distributed Systems Pub Date : 2025-01-29 DOI: 10.1109/TPDS.2025.3531123
Yaxin Luopan;Rui Han;Qinglong Zhang;Xiaojiang Zuo;Chi Harold Liu;Guoren Wang;Lydia Y. Chen
Vol. 36, No. 4, pp. 775-790
Abstract: Federated continual learning (FCL) has attracted growing attention in achieving collaborative model training among edge clients, each of which learns its local model for a sequence of tasks. Most existing FCL approaches aggregate clients’ latest local models to exchange knowledge. This unfortunately deviates from real-world scenarios where each model is optimized independently using the client’s own dynamic data and different clients have heterogeneous tasks. These tasks not only have distinct class labels (e.g., animals or vehicles) but also differ in input feature distributions. The aggregated model thus often shifts to a higher loss value and incurs accuracy degradation. In this article, we depart from the model-grained view of aggregation and transform it into multiple task-grained aggregations. Each aggregation allows a client to learn from other clients to improve its model accuracy on one task. To this end, we propose Loci to provide abstractions for clients’ past and peer task knowledge using compact model weights, and develop a communication-efficient approach to train each client’s local model by exchanging its tasks’ knowledge with the most accuracy-relevant one from other clients. Through its general-purpose API, Loci can be used to provide efficient on-device training for existing deep learning applications of graph, image, natural language processing, and multimodal data. Using extensive comparative evaluations, we show Loci improves the model accuracy by 32.48% without increasing training time, reduces communication cost by 83.6%, and achieves more improvements when scale (task/client number) increases.
Citations: 0
FHE4DMM: A Low-Latency Distributed Matrix Multiplication With Fully Homomorphic Encryption
IF 5.6 Zone 2 Computer Science
IEEE Transactions on Parallel and Distributed Systems Pub Date : 2025-01-28 DOI: 10.1109/TPDS.2025.3534846
Yi Chen;Qiang-Sheng Hua;Zixiao Hong;Lin Zhu;Hai Jin
Vol. 36, No. 4, pp. 645-658
PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10856418
Abstract: Fully Homomorphic Encryption (FHE) is a promising technology for secure, non-interactive outsourced computation. One notable method to increase the throughput of FHE-based outsourcing is batching, which typically involves large-scale matrix-matrix multiplications (MM). However, the substantial overhead inherent in existing FHE schemes poses a major challenge for processing these large-scale tasks, often resulting in insufficient memory or prolonged delays on a single machine, making it practically unviable. Utilizing multi-machine parallelism in cloud clusters for outsourced computation offers a natural solution to these obstacles. In this work, we propose FHE4DMM, a distributed algorithm that provides a unified view on encrypted matrices, accommodating various FHE schemes and any matrix dimensions, to accelerate large-scale encrypted MM. A key innovation is its reuse optimizations for parallelized homomorphic computations, which can offer valuable insights for broader FHE-based applications. We utilized FHE4DMM to conduct large-scale square ($4096 \times 4096$) and rectangular ($32768 \times 32768,\ 32768 \times 16$) matrix multiplications on 256 machines, achieving computation times of 172.2 s and 76.1 s, respectively, while ensuring a 128-bit security level. For scalability, the experiments demonstrate that FHE4DMM achieves linear speedup for $2^{i}$ ($i$ from 0 to 6) machines across various matrix dimension cases. In addition, within the range of matrix dimensions that the state-of-the-art (SOTA) distributed FHE-MM algorithm (Huang et al. 2023) can handle, FHE4DMM attains a maximum speedup of 16.62x. To assess its practical performance, FHE4DMM is applied in a basic multi-layer feedforward network. We used 64 machines to perform secure outsourced inference on the MNIST and CIFAR-10 datasets with encrypted models and data. Compared to using the SOTA, our method achieved speedups of up to 3.54x and 4.22x, respectively, with the MM module obtaining 4.09x and 4.87x speedups.
Citations: 0
A Collaborative Service Composition Approach Considering Providers’ Self-Interest and Minimal Service Sharing
IF 5.6 Zone 2 Computer Science
IEEE Transactions on Parallel and Distributed Systems Pub Date : 2025-01-27 DOI: 10.1109/TPDS.2025.3534283
Xiao Wang;Hanchuan Xu;Jian Yang;Xiaofei Xu;Zhongjie Wang
Vol. 36, No. 3, pp. 598-615
Abstract: Service composition dynamically integrates various services from multiple providers to meet complex user requirements. However, most existing methods assume centralized control over all services, which is often unrealistic because providers typically prefer to independently manage their own services, posing challenges to the application of traditional methods. Collaborative service composition offers a solution by enabling providers to work together to complete service composition. However, this approach also faces its own challenges. Driven by self-interest, providers may be reluctant to offer services needed by others, and due to business competition, they may wish to share as few services as possible (where sharing services means disclosing service information to other providers). To address these challenges, we propose a novel collaborative service composition approach that comprehensively considers each provider’s self-interest and achieves service composition with minimal service sharing. First, we introduce a “self-interest degree” model to capture providers’ self-interest. This behavior may lead to service refusal, so we design a service availability prediction method based on a reputation model to minimize rejections. Then, we propose a decentralized service composition method. It utilizes historical composition records to mine empirical rules between requirements and services, constructing a correlation matrix, and collaboratively trains a multi-label classification model with other providers under a distributed federated learning framework. Combining the matrix and model outputs, we design a service composition method and a node coordination protocol that completes service composition with minimal service sharing. Experimental results demonstrate the effectiveness of the proposed method in capturing providers’ self-interest and showcase its superior performance compared to existing methods.
Citations: 0
Monte: SFCs Migration Scheme in the Distributed Programmable Data Plane
IF 5.6 Zone 2 Computer Science
IEEE Transactions on Parallel and Distributed Systems Pub Date : 2025-01-21 DOI: 10.1109/TPDS.2025.3532467
Xiaoquan Zhang;Lin Cui;Fung Po Tso;Yuhui Deng;Zhetao Li;Weijia Jia
Vol. 36, No. 4, pp. 633-644
Abstract: Service function chains (SFCs) are sequences of network functions that provide specific services to meet operators’ needs in today's ISPs and datacenter networks. To improve the performance of SFCs, programmable data planes are used to leverage their low-latency and high-performance packet processing. However, SFCs need to be adaptable to dynamics such as changes in requirements and attributes. Therefore, the ability to migrate SFCs is essential. Unfortunately, migrating SFCs in distributed programmable data planes is challenging due to the risk of degraded performance and failure to meet SFC requirements and resource constraints in switches. In this paper, we propose Monte, which provides an effective SFC migration scheme in distributed programmable data planes. We build a novel integer programming model to represent the migration process with constraints on resource limitations of switches and SFC attributes in the distributed data plane. Additionally, an SFC migration algorithm is designed to optimize the migration cost by deeply analyzing resource allocation in the switch pipeline. Monte has been implemented on both P4 software switches (BMv2) and hardware switches (Intel Tofino ASIC). Extensive evaluation results show that the migration cost in Monte is 94.03% lower on average than the state-of-the-art deployment scheme, and Monte can effectively save pipeline resources.
Citations: 0