IEEE Transactions on Parallel and Distributed Systems: Latest Articles

ISACPP: Interference-Aware Scheduling Approach for Deep Learning Training Workloads Based on Co-Location Performance Prediction
IF 5.6 | CAS Zone 2, Computer Science
IEEE Transactions on Parallel and Distributed Systems Pub Date : 2025-06-09 DOI: 10.1109/TPDS.2025.3577796
Zijie Liu;Yi Cheng;Can Chen;Jun Hu;Rongguo Fu;Dengyin Zhang
Abstract: Traditional exclusive cloud resource allocation for deep learning training (DLT) workloads is unsuitable for advanced GPU infrastructure, leading to resource under-utilization. DLT workload co-location offers a promising way to improve utilization, but existing co-location methods fail to accurately quantify interference among DLT workloads, resulting in performance degradation. To address this problem, this article proposes ISACPP, an interference-aware scheduling approach for DLT workloads based on co-location performance prediction. ISACPP first builds an edge-fusion gated graph attention network (E-GGAT) that incorporates DL model structures, underlying GPU types, and hyper-parameter settings to predict co-location performance. Since the co-location state changes as each workload completes, ISACPP derives a multi-stage co-location interference quantification model from the predicted performance to identify the GPU device with the minimum overall interference. Experimental results show that ISACPP estimates the co-location performance of DLT workloads with maximum prediction errors of 8.72%, 1.9%, and 4.4% for execution time, GPU memory consumption, and GPU utilization, respectively, and shortens workload makespan by up to 34.9% compared to state-of-the-art interference-aware scheduling methods.
Vol. 36, No. 8, pp. 1591-1607. Citations: 0
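The placement step of such an interference-aware scheduler can be illustrated with a minimal sketch. This is not the paper's E-GGAT predictor; it assumes the predictions already exist as hypothetical per-workload slowdown factors and simply picks the GPU whose co-location would incur the least overall interference.

```python
# Toy interference-aware placement: given hypothetical predicted slowdown
# factors (>= 1.0, one per workload) for co-locating a new DLT job on each
# candidate GPU, choose the GPU with minimum overall interference. Here
# "overall interference" is taken as the sum of excess slowdowns.

def pick_gpu(predicted_slowdown):
    """predicted_slowdown: dict gpu_id -> list of predicted slowdown factors
    for the resident jobs plus the incoming job if co-located there."""
    def interference(factors):
        return sum(f - 1.0 for f in factors)
    return min(predicted_slowdown, key=lambda g: interference(predicted_slowdown[g]))

# Hypothetical predictions: gpu0 already runs three jobs with heavy mutual
# interference, gpu1 runs two nearly unaffected jobs.
preds = {
    "gpu0": [1.45, 1.30, 1.25],
    "gpu1": [1.05, 1.02],
}
print(pick_gpu(preds))  # -> gpu1
```

The real system replaces the hand-written `preds` table with model predictions and re-evaluates them at each co-location state change.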
RHINO: An Efficient Serverless Container System for Small-Scale HPC Applications
IEEE Transactions on Parallel and Distributed Systems Pub Date : 2025-06-04 DOI: 10.1109/TPDS.2025.3576584
He Zhu;Mingyu Li;Haihang You
Abstract: Serverless computing, with its pay-as-you-go pricing and auto-scaling, offers a promising alternative for High Performance Computing (HPC) applications, since traditional HPC clusters often suffer long queue times and resource over- or under-provisioning. However, current serverless platforms struggle to support HPC applications due to restricted inter-function communication and tightly coupled runtimes. To address these issues, we introduce RHINO, which offers end-to-end support for developing and deploying serverless HPC. Using a Two-Step Adaptive Build strategy, HPC code is packaged into lightweight, scalable functions. The Rhino Function Execution Model decouples HPC applications from the underlying infrastructure, and the Auto-scaling Engine dynamically scales cloud resources and schedules tasks based on performance and cost requirements. We deploy RHINO on AWS Fargate and evaluate it on both benchmarks and real-world workloads. Compared to traditional VM clusters, RHINO achieves a 10%-30% performance improvement for small-scale applications and more than 40% cost reduction.
Vol. 36, No. 8, pp. 1560-1573. Citations: 0
OpenSN: An Open Source Library for Emulating LEO Satellite Networks
IEEE Transactions on Parallel and Distributed Systems Pub Date : 2025-06-02 DOI: 10.1109/TPDS.2025.3575920
Wenhao Lu;Zhiyuan Wang;Hefan Zhang;Shan Zhang;Hongbin Luo
Abstract: Low-earth-orbit (LEO) satellite constellations (e.g., Starlink) are becoming a necessary component of the future Internet, and studies on LEO satellite networking are increasing. How to evaluate these studies in a systematic and reproducible manner is a crucial problem. In this paper, we present OpenSN, an open source library for emulating large-scale satellite networks (SNs). Unlike Mininet-based SN emulators (e.g., LeoEM), OpenSN adopts container-based virtualization, which allows distributed routing software to run on each node and achieves horizontal scalability via flexible multi-machine extension. Compared to other container-based SN emulators (e.g., StarryNet), OpenSN streamlines interaction with the Docker command line interface and significantly reduces unnecessary virtual-link creation operations; these modifications improve emulation efficiency and vertical scalability on a single machine. Furthermore, OpenSN separates user-defined configuration from container network management via a key-value database that records the information needed for SN emulation, and this separation enhances extensibility. In sum, OpenSN offers advantages in efficiency, scalability, and extensibility, making it a valuable open source library for research on LEO satellite networking. Experiments show that OpenSN constructs mega-constellations 5x-10x faster than StarryNet and updates link state 2x-4x faster than LeoEM. We also verify its scalability by emulating the five-shell Starlink constellation with a total of 4,408 satellites.
Vol. 36, No. 8, pp. 1574-1590. Citations: 0
FedCSpc: A Cross-Silo Federated Learning System With Error-Bounded Lossy Parameter Compression
IEEE Transactions on Parallel and Distributed Systems Pub Date : 2025-04-28 DOI: 10.1109/TPDS.2025.3564736
Zhaorui Zhang;Sheng Di;Kai Zhao;Sian Jin;Dingwen Tao;Zhuoran Ji;Benben Liu;Khalid Ayed Alharthi;Jiannong Cao;Franck Cappello
Abstract: Cross-silo federated learning is widely used to scale deep neural network (DNN) training over data silos at different locations worldwide while guaranteeing data privacy. Communication is the main bottleneck when training large-scale models, due to the large volume of model parameters and gradients transmitted across public networks with limited bandwidth. Most previous work focuses on gradient compression, while little work compresses parameters, whose impact on communication performance during training cannot be ignored. To bridge this gap, we propose FedCSpc, an efficient cross-silo federated learning system with an XAI-driven adaptive parameter compression strategy for large-scale model training. Our work differs substantially from existing gradient compression techniques because gradients and parameters have distinct data characteristics. The key contributions are fourfold. (1) FedCSpc compresses parameters during training using the state-of-the-art error-bounded lossy compressor SZ3. (2) We develop an adaptive compression error-bound adjustment algorithm that effectively preserves model accuracy. (3) We exploit an efficient approach that uses clients' idle CPU resources to compress the parameters. (4) We perform a comprehensive evaluation with a wide range of models and benchmarks on a GPU cluster with 65 GPUs. Results show that FedCSpc matches the model accuracy of FedAvg while reducing the communicated data volume of parameters and gradients by up to 7.39x and 288x, respectively. With 32 clients on a 4 Gb model, FedCSpc significantly outperforms FedAvg in wall-clock time in an emulated WAN environment (at bandwidths of 1 Gbps or lower).
Vol. 36, No. 7, pp. 1372-1386. Citations: 0
Parallel Greedy Algorithms for Steiner Forest
IEEE Transactions on Parallel and Distributed Systems Pub Date : 2025-04-24 DOI: 10.1109/TPDS.2025.3563849
Laleh Ghalami;Daniel Grosu
Abstract: The Steiner Forest Problem is a fundamental combinatorial optimization problem in operations research and computer science. Given an undirected graph with non-negative edge weights and a set of terminal vertex pairs, the problem is to find the minimum-cost subgraph that connects each terminal pair. We design a family of parallel greedy algorithms based on a sequential greedy heuristic called Paired Greedy, which iteratively connects the terminal pairs with the minimum distance. The algorithms in the family exhibit varying degrees of parallelism, determined by the number of pairs connected in parallel in each iteration. We implement and run the algorithms on a multi-core system and perform an extensive experimental analysis on a rich library of Steiner Forest instances with various underlying graph types. The results show that our parallel algorithms achieve significant speedup over the sequential Paired Greedy algorithm while producing solutions whose costs are very close to those of the sequential algorithm. We also provide recommendations for selecting the type of parallel algorithm and its parameters to achieve the most efficient results for each class of instances.
Vol. 36, No. 6, pp. 1311-1325. Citations: 0
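The sequential heuristic can be sketched from its one-line description. This is my reading of Paired Greedy, not the authors' code: repeatedly connect the terminal pair whose current shortest-path distance is smallest, treating already-purchased edges as free so later pairs can reuse them.

```python
import heapq

def dijkstra(adj, bought, src):
    """Shortest paths from src, where edges already in `bought` cost 0."""
    dist, prev, pq = {src: 0.0}, {}, [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in adj[u]:
            cost = 0.0 if (u, v) in bought or (v, u) in bought else w
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(pq, (nd, v))
    return dist, prev

def paired_greedy(adj, pairs):
    """Sketch of Paired Greedy on a connected graph: buy the shortest path
    for the cheapest unconnected pair, then repeat with bought edges free."""
    bought, remaining = set(), list(pairs)
    while remaining:
        best = None
        for s, t in remaining:
            dist, prev = dijkstra(adj, bought, s)
            if best is None or dist.get(t, float("inf")) < best[0]:
                best = (dist.get(t, float("inf")), s, t, prev)
        _, s, t, prev = best
        v = t
        while v != s:                      # buy edges on the chosen path
            bought.add((prev[v], v))
            v = prev[v]
        remaining.remove((s, t))
    return bought

# Tiny example: path graph a-b-c-d, unit weights, pairs (a,b) and (a,d).
adj = {
    "a": [("b", 1.0)], "b": [("a", 1.0), ("c", 1.0)],
    "c": [("b", 1.0), ("d", 1.0)], "d": [("c", 1.0)],
}
forest = paired_greedy(adj, [("a", "b"), ("a", "d")])
print(sorted(forest))  # -> [('a', 'b'), ('b', 'c'), ('c', 'd')]
```

The paper's parallel variants connect several of the cheapest pairs per iteration instead of one, trading a little solution cost for speedup.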
Beehive: Decentralised High-Frequency Small Tasks Scheduling in Large Clusters
IEEE Transactions on Parallel and Distributed Systems Pub Date : 2025-04-22 DOI: 10.1109/TPDS.2025.3563457
Yuxia Cheng;Linfeng Xu;Tongkai Yang;Wei Wu;Zhiqiang Lin;Antong Yu;Wenzhi Chen
Abstract: Data centers struggle with growing cluster sizes and rising submissions of short-lived, high-frequency tasks that cause performance bottlenecks in task scheduling. Existing centralized and distributed scheduling systems fall short of performance requirements due to computational overload on the scheduler, cluster-state management overhead, and scheduling conflicts. To address these challenges, this article introduces Beehive, a novel lightweight decentralized scheduling framework. In Beehive, each cluster node schedules tasks within its local neighborhood, effectively reducing resource management overhead and scheduling conflicts. Moreover, all nodes are interconnected in a small-world network, an efficient structure that lets tasks reach resources across the entire cluster through global routing. This lightweight design enables Beehive to scale efficiently, supporting over 10,000 nodes and up to 80,000 task submissions per second without single-node scheduling bottlenecks. Experimental results show that Beehive significantly reduces scheduling latency: 99% of tasks are scheduled within 100 milliseconds, and scheduling throughput increases linearly with the number of nodes. Compared to existing centralized and distributed scheduling frameworks, Beehive substantially alleviates scheduling bottlenecks, particularly for high-frequency, short-lived tasks.
Vol. 36, No. 6, pp. 1326-1337. Citations: 0
AWB+-Tree: A Novel Width-Based Index Structure Supporting Hybrid Matching for Large-Scale Content-Based Pub/Sub Systems
IEEE Transactions on Parallel and Distributed Systems Pub Date : 2025-04-16 DOI: 10.1109/TPDS.2025.3561714
Zhengyu Liao;Shiyou Qian;Zhonglong Zheng;Jian Cao;Guangtao Xue;Minglu Li
Abstract: Event matching is a key component of a large-scale content-based publish/subscribe system, and the performance of most existing algorithms is highly sensitive to the subscription matching probability. In this article, we propose a new data structure named AWB+-Tree, based on the width of predicates, to index subscriptions efficiently. The most notable feature of the AWB+-Tree is its ability to combine the advantages of different matching methods, achieving high and robust performance in dynamic environments. We first implement both a forward matching method (AFM) and a backward matching method (ABM) on the AWB+-Tree, then introduce a hybrid matching method (AHM) that combines the two. We further extend the AWB+-Tree in three directions: approximate matching, string-type matching, and fine-grained parallelization. Extensive experiments on synthetic and real-world datasets show that AHM reduces matching time by up to 53.8% compared to the state-of-the-art method. AHM also exhibits improved performance robustness, with up to a 76.9% reduction in the standard deviation of matching time; in dynamic scenarios in particular, AHM is at least 2.3x faster and 41.3% more stable than its counterparts. Furthermore, with parallelization, 8-thread matching is 4.16x faster than single-threaded matching.
Vol. 36, No. 6, pp. 1268-1281. Citations: 0
Raccoon: Lightweight Support for Comprehensive Control Flows in Reconfigurable Spatial Architectures
IEEE Transactions on Parallel and Distributed Systems Pub Date : 2025-04-15 DOI: 10.1109/TPDS.2025.3561145
Xiangyu Kong;Yi Huang;Longlong Chen;Jianfeng Zhu;Liangwei Li;Xingchen Man;Mingyu Gao;Shaojun Wei;Leibo Liu
Abstract: Coarse-grained reconfigurable arrays (CGRAs) have emerged as promising candidates for digital signal processing, biomedical, and automotive applications, where energy efficiency and flexibility are paramount. Yet existing CGRAs suffer from an Amdahl bottleneck caused by constrained control handling via either off-device communication or expensive tag-matching mechanisms. More importantly, mapping control flow onto CGRAs is arduous and time-consuming due to intricate instruction structures and hardware mechanisms. To counteract these limitations, we propose Raccoon, a portable and lightweight framework for CGRAs targeting rich control flows. Raccoon spans microarchitecture, the HW/SW interface, and the compiler. In the microarchitecture, Raccoon incorporates specialized infrastructure for branch- and loop-level control patterns with concise execution mechanisms. Its HW/SW interface includes well-characterized abstractions and instruction sets tailored for easy compilation, featuring custom operators and architectural models for control-oriented units. On the compiler front, Raccoon integrates advanced control-handling techniques and employs a portable mapper leveraging reinforcement learning and Monte Carlo tree search, enabling agile mapping and optimization of the entire program for efficient execution and high-quality results. Through this cohesive co-design, Raccoon equips various CGRAs with robust control-flow handling, surpassing conventional tagged mechanisms in hardware efficiency and compiler adaptability. Evaluation results show up to a 5.78x improvement in energy efficiency and a 2.24x reduction in cycle count over state-of-the-art CGRAs. Raccoon stands out for its versatility in managing intricate control flows and its portability across diverse CGRA architectures.
Vol. 36, No. 6, pp. 1294-1310. Citations: 0
CausalConf: Datasize-Aware Configuration Auto-Tuning for Recurring Big Data Processing Jobs via Adaptive Causal Structure Learning
IEEE Transactions on Parallel and Distributed Systems Pub Date : 2025-04-15 DOI: 10.1109/TPDS.2025.3560304
Hui Dou;Mingjie He;Lei Zhang;Yiwen Zhang;Zibin Zheng
Abstract: To ensure high-performance processing across diverse application scenarios, Big Data frameworks such as Spark and Flink expose many performance-related configuration parameters. Given the computation scale and the repeated executions of typical recurring Big Data processing jobs, automatically tuning these parameters for performance has become a hot research topic in both academia and industry. With advantages in interpretability and generalization, causal inference-based methods have recently proven superior to conventional search-based and machine learning-based methods. However, the complexity of Big Data frameworks, the time-varying input dataset size of a recurring job, and the limitations of any single causal structure learning algorithm together prevent these methods from practical application. In this paper, we design and implement CausalConf, a datasize-aware configuration auto-tuning approach for recurring Big Data processing jobs via adaptive causal structure learning. The offline training phase trains multiple datasize-aware causal structure models with different causal structure learning algorithms, while the online tuning phase iteratively recommends the next promising configuration via Multi-Armed Bandit-based optimal intervention set selection and a novel datasize-aware causal Bayesian optimization. To evaluate CausalConf, we conduct a series of experiments on a local Spark cluster with 9 previously unseen target applications from HiBench. Results show that the performance speedup of CausalConf over four recent, representative baselines reaches 1.45x, 1.31x, 1.26x, and 1.54x on average, and up to 2.53x, 1.55x, 1.57x, and 2.18x, respectively. Moreover, the average total online tuning cost of CausalConf is reduced by 8.85%, 14.26%, 18.58%, and 14.29%, respectively.
Vol. 36, No. 7, pp. 1354-1371. Citations: 0
A Lightweight and Fine-Grained Ciphertext Search Scheme for Big Data Assisted by Proxy Servers
IEEE Transactions on Parallel and Distributed Systems Pub Date : 2025-04-14 DOI: 10.1109/TPDS.2025.3560694
Na Wang;Kaifa Zheng;Wen Zhou;Jianwei Liu;Lunzhi Deng;Junsong Fu
Abstract: In Big Data scenarios, data volumes are enormous, and distributed data computation and storage with efficient algorithms is promising. However, most current ciphertext search schemes are designed for centralized cloud computing platforms and are inefficient and inapplicable in Big Data scenarios. A proxy-server-based system extends cloud computing by moving part of the data storage and computation burden from end users to edge servers, greatly decreasing users' resource costs. In this paper, we propose a searchable encryption scheme for Big Data, assisted by cloud computing and proxy servers, that simultaneously accomplishes Lightweight Fine-grained access control and Efficient multi-keyword top-k ciphertext Search (LFES). To cope with all types of data, we design an innovative fine-grained access control mechanism based on attribute-based encryption and a key distribution protocol, so that only users with licensed attributes can access data efficiently. We then propose a public-key searchable encryption scheme based on privacy-preserving Private Set Intersection (PSI) and the proxy server model, which greatly reduces the computation burden on end users and improves retrieval efficiency. Meanwhile, to prevent tampering with stored ciphertexts, we design a practical data integrity audit mechanism. Security analysis shows that LFES resists the Chosen Keyword Attack (CKA) and the Keyword Guessing Attack (KGA), and simulations show that LFES is efficient and feasible in practice.
Vol. 36, No. 7, pp. 1460-1477. Citations: 0