Journal of Parallel and Distributed Computing: Latest Articles

On the development of high-performance, multi-GPU applications on heterogeneous systems leveraging SYCL
IF 4, Zone 3, Computer Science
Journal of Parallel and Distributed Computing Pub Date: 2026-01-01 Epub Date: 2025-10-24 DOI: 10.1016/j.jpdc.2025.105188 Volume: 207 Article: 105188
Francisco J. Andújar, Rocío Carratalá-Sáez, Yuri Torres, Arturo Gonzalez-Escribano, Diego R. Llanos
Abstract: Computational platforms for high-performance scientific applications are increasingly heterogeneous, incorporating multiple GPU accelerators. However, differences in GPU vendors, architectures, and programming models challenge performance portability and ease of development. SYCL provides a unified programming approach, enabling applications to target NVIDIA and AMD GPUs simultaneously while offering higher-level abstractions for data and task management. This paper evaluates SYCL’s performance and development effort using the Finite Time Lyapunov Exponent (FTLE) calculation as a case study. We compare SYCL’s AdaptiveCpp (Ahead-Of-Time and Just-In-Time) and Intel oneAPI compilers, along with different data management strategies (Unified Shared Memory and buffers), against equivalent CUDA and HIP implementations. Our analysis considers single and multi-GPU execution, including heterogeneous setups with GPUs from different vendors. Results show that, while SYCL introduces additional development effort compared to native CUDA and HIP implementations, it enables multi-vendor portability with minimal performance overhead when using specific design options. Based on our findings, we provide development guidelines to help programmers decide when to use SYCL versus vendor-specific alternatives.
Citations: 0
A General-Purpose K-Nearest Neighbor Method with an Efficient Pruning Strategy for GPUs
IF 4, Zone 3, Computer Science
Journal of Parallel and Distributed Computing Pub Date: 2026-01-01 Epub Date: 2025-10-20 DOI: 10.1016/j.jpdc.2025.105187 Volume: 207 Article: 105187
Jue Wang, Fumihiko Ino
Abstract: K-nearest neighbor (kNN) search is widely applied to low- and high-dimensional tasks, as well as various data distributions and distance functions. However, its computational cost increases with the data volume, causing a bottleneck for many applications. The workload of the existing tree-based methods linearly increases with the neighbor count k in the worst case. In addition, some tree-based methods only apply to tasks with L2 distances and may have severe warp divergence when employed on GPUs. Our goal is to develop a general-purpose kNN method based on cluster sorting to achieve better pruning efficiency compared with tree-based approaches. We optimize the proposed method to achieve higher performance on tasks with different dimensionalities or distance functions. The proposed Sort, TraversE, and then Prune (STEP) algorithm is a kNN method that clusters the data points beforehand. With various 1) numbers of data points, 2) numbers of query points, 3) neighbor counts, 4) dimensions, and 5) distance metrics, the STEP method offers high performance because of the following aspects. First, our method prunes the data points efficiently by sorting the clusters for each query. Second, we exploit the single-instruction multiple-threads (SIMT) architecture of the GPU and utilize both coarse- and fine-grained parallelism to accelerate computation. The proposed method concurrently computes all queries and minimizes warp divergence by assigning a query to a GPU warp. Third, the STEP method rapidly updates the kNN results using bitonic operations. Fourth, we propose an adaptive approach that automatically switches from the indexing approach to the exhaustive approach to achieve good scalability on high-dimensional data. Finally, we develop a variant of Gärtner’s bounding sphere algorithm so that our indexing method can handle distance metrics other than the L2 distance. The STEP method achieves a 15.9 times speedup with L2 distances and a 36.7 times speedup with angular distances compared with other state-of-the-art methods.
Citations: 0
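The sort-then-prune idea in the kNN abstract above can be illustrated on the CPU. The following is a minimal sketch, not the STEP algorithm itself: it assumes L2 distance, picks cluster centers by random sampling, bounds each cluster with a simple center-plus-radius sphere, and keeps the k best results in a heap instead of using bitonic operations. The function names (`build_clusters`, `knn_pruned`) are hypothetical.

```python
import heapq
import math
import random


def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_clusters(points, num_clusters, seed=0):
    """Assign each point to the nearest of `num_clusters` sampled centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, num_clusters)
    members = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda c: l2(p, centers[c]))
        members[nearest].append(p)
    # Radius = distance from the center to its farthest member.
    radii = [max((l2(c, p) for p in m), default=0.0) for c, m in zip(centers, members)]
    return centers, radii, members


def knn_pruned(query, k, centers, radii, members):
    """Return the k smallest distances from `query`, skipping clusters that cannot help."""
    # Triangle inequality: every point in cluster c is at least d(q, c) - r_c away.
    bounds = sorted(
        (max(0.0, l2(query, c) - r), i) for i, (c, r) in enumerate(zip(centers, radii))
    )
    best = []  # max-heap of the k best distances, stored negated
    for lower_bound, i in bounds:
        if len(best) == k and lower_bound >= -best[0]:
            break  # clusters are sorted by bound, so all remaining ones are pruned too
        for p in members[i]:
            d = l2(query, p)
            if len(best) < k:
                heapq.heappush(best, -d)
            elif d < -best[0]:
                heapq.heapreplace(best, -d)
    return sorted(-d for d in best)
```

Because the clusters are visited in order of their lower bounds, the first cluster whose bound is no better than the current k-th distance ends the search for that query. The abstract's point about supporting other metrics corresponds to replacing `l2` and the bounding sphere; this sketch does not attempt that.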
A scalable tensor-based MDTW approach for multi-modal time series patterns clustering
IF 4, Zone 3, Computer Science
Journal of Parallel and Distributed Computing Pub Date: 2026-01-01 Epub Date: 2025-09-03 DOI: 10.1016/j.jpdc.2025.105173 Volume: 207 Article: 105173
Bahati Alam Sanga, Laurence T. Yang, Shunli Zhang, Zecan Yang, Nicholaus Gati
Abstract: Multi-modal Time Series (MTS) is a vital ingredient of Predictive Multi-modal Artificial Intelligence (PMAI). MTS systems capture varying temporal modalities and their inherent dependencies for accurate analytics. However, efficiently exploring these cross-modality relationships is a challenging research problem due to their complexity and information redundancies. Pairwise similarity measures of MTS patterns precede PMAI. Multi-modal Dynamic Time Warping (MDTW) is frequently explored to quantify similar MTS. Yet it relies on orthogonal-conditioned local similarity measures that ignore the contributions of the MTS's underlying structural relationships in the warping process and is, hence, susceptible to unrealistic matching. This paper addresses these setbacks by recommending a scalable MTS recognition model, named Tensor-Slices Distance (TSD)-based MDTW (TSD-MDTW), that is subsequently advanced to two more distinct models termed Weighted modality and TSD (WmTSD-MDTW) and TSD-Mahalanobis (TSDMaha-MDTW). To quantify an alignment's cost, TSD-MDTW incorporates intrinsic spatial dependencies between modalities' coordinates, WmTSD-MDTW relaxes information redundancies by weighting modalities based on information richness, and TSDMaha-MDTW embodies modality dependencies and their coordinates' innate spatial dependencies. In addition, the paper proposes a scalable Tensor-based DTW (TDTW) model that reformulates MDTW into multiple dimensions that are found to parallelize the warping processes. Theoretical and empirical experimental results on MTS multi-modal datasets encompassing load patterns and meteorological modalities reveal TDTW's efficiency and the proposals' superior performance in terms of cluster compactness and separation over MDTW employing state-of-the-art local similarity measures.
Citations: 0
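For readers unfamiliar with the baseline that the MDTW abstract above builds on, here is a minimal sketch of multi-dimensional dynamic time warping. It assumes the plain Euclidean distance across all modalities as the local similarity measure; it is not the TSD, WmTSD, or TSDMaha variant proposed in the paper, and the function name `mdtw` is hypothetical.

```python
import math


def mdtw(a, b):
    """Multi-dimensional DTW cost between two series of equal-width vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Local distance: Euclidean distance across all modalities at once.
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

Two series that differ only by a repeated sample align at zero cost, which is the warping behaviour the similarity measure relies on. Swapping `math.dist` for another local measure is where the models described in the abstract would differ from this baseline.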
Security vulnerabilities and enhancement of a dynamic auditing scheme for regenerating code-based storage in cloud-fog-assisted IIoT
IF 4, Zone 3, Computer Science
Journal of Parallel and Distributed Computing Pub Date: 2026-01-01 Epub Date: 2025-10-14 DOI: 10.1016/j.jpdc.2025.105185 Volume: 207 Article: 105185
Guangjun Liu, Jinbo Xiong, Ximeng Liu, Xiang Zou, Chenghu Ke, Zengfa Dou
Abstract: In a recent publication, Liu et al. put forth a privacy-preserving dynamic auditing scheme for distributed encoded storage systems in cloud-fog-assisted Industrial Internet of Things (IIoT) [Internet of Things, DOI: 10.1016/j.iot.2024.101084]. Each encoded data segment utilizes the ZSS signature for the creation of its corresponding authentication tag. The fog server is challenged and subjected to rigorous verification through the use of a bilinear pairing map. In this paper, we demonstrate the security vulnerabilities of Liu et al.’s scheme by mounting a block forgery attack and an identifier forgery attack, respectively. In particular, an adversarial fog server is capable of successfully deceiving the proxy auditor through arbitrary unauthorised data tampering or identifier impersonation. We also provide an alternative scheme to address the security weaknesses, and highlight the challenges of cloud data auditing tailored for cloud-fog-enabled IIoT.
Citations: 0
A lightweight fine-grained scheme for distinguishing the hotness of warm data to reduce segment cleaning overhead
IF 4, Zone 3, Computer Science
Journal of Parallel and Distributed Computing Pub Date: 2026-01-01 Epub Date: 2025-10-10 DOI: 10.1016/j.jpdc.2025.105183 Volume: 207 Article: 105183
Lihua Yang, Yang Xiao, Zhipeng Tan, Fang Wang, Weizhao Lin, Wei Zhang, Jiaxin Li, Kai Lu
Abstract: With the widespread adoption of flash memory, the Flash Friendly File System (F2FS), designed for flash memory characteristics, has become widely used in large data centers. However, F2FS suffers from significant cleaning overheads due to its logging write scheme. We observe that warm data in F2FS account for a substantial proportion, at least 80%. Nevertheless, the mixed storage of warm data with varying hotness exacerbates segment cleaning challenges. To address this issue, we propose a scheme called M2H, which involves fine-grained management of warm data hotness identified by the K-means clustering algorithm. M2H determines hotness by considering factors such as file block update distance, most recently used distance, and workload characteristics. M2H facilitates Multi-log delayed writing and Modified segment cleaning based on Hotness. To reduce costs associated with distinguishing data hotness at the file block level, we employ Mini Batch K-means, which is referred to as HMBK. Moreover, for servers equipped with GPUs, the clustering process can be offloaded to the GPU, known as HGPU. We conduct a comprehensive comparison of traditional F2FS, M2H, HMBK, and HGPU on a real platform. Results show that compared to traditional F2FS, HGPU reduces the number of segment cleanings by 54.41% to 97.93%.
Citations: 0
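The hotness classification described in the M2H abstract above rests on K-means clustering of per-block features. The sketch below shows plain Lloyd's K-means on two hypothetical features per file block (update distance, most-recently-used distance). The feature values, the deterministic initialisation, and the function name `kmeans` are assumptions for illustration; the paper's Mini Batch and GPU variants are not reproduced here.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means; returns (centroids, labels)."""
    # Deterministic init: spread the initial centroids over the sorted input.
    ordered = sorted(points)
    centroids = [ordered[(i * (len(ordered) - 1)) // max(k - 1, 1)] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each block joins its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            group = [p for p, lab in zip(points, labels) if lab == c]
            if group:  # keep the old centroid if a cluster went empty
                centroids[c] = tuple(sum(v) / len(group) for v in zip(*group))
    return centroids, labels


# Hypothetical (update distance, MRU distance) pairs for eight file blocks.
blocks = [(1, 2), (2, 1), (1, 1), (50, 40), (55, 45), (52, 41), (200, 180), (210, 190)]
centroids, labels = kmeans(blocks, k=3)
```

With these made-up inputs the blocks fall into three groups of small, medium, and large distances, which a file system could then map to separate logs. A mini-batch variant would update centroids from a sampled subset per iteration rather than from all blocks.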
Front Matter 1 - Full Title Page (regular issues)/Special Issue Title page (special issues)
IF 4, Zone 3, Computer Science
Journal of Parallel and Distributed Computing Pub Date: 2026-01-01 Epub Date: 2025-11-17 DOI: 10.1016/S0743-7315(25)00161-3 Volume: 207 Article: 105194
Citations: 0
SoRCS: A scalable blockchain model with separation of role, chain and storage
IF 4, Zone 3, Computer Science
Journal of Parallel and Distributed Computing Pub Date: 2025-12-01 Epub Date: 2025-08-05 DOI: 10.1016/j.jpdc.2025.105160 Volume: 206 Article: 105160
Bin Yu, Lei Chen, He Zhao, Zhiyu Ma, Haotian Cheng, Xiaoting Zhang, Liang Sun, Tong Zhou, Nianzu Sheng
Abstract: The industrial use of blockchain technology is becoming more widespread, but the scalability of blockchain is still one of the primary challenges in large-scale practical applications. Separation schemes are being introduced by many blockchain projects to solve their scalability problems. In this paper, we propose a comprehensive separation scheme, SoRCS, which separates the node role, the chain, and the data storage. It makes full use of the resources of each node, reduces the load on the nodes, and improves the degree of decentralization. Ordering of verified transactions, execution of ordered transactions, and confirmation of ordering and execution blocks run concurrently within different sub-networks to improve blockchain performance. Based on the results of the block consensus, we provide a three-phase response: documented, executed, and confirmed.
Based on the SoRCS architecture, we also implement a prototype system that consists of 1200 nodes to evaluate our separation schemes. Its peak throughput is 14.7 Ktps and its latency is around 0.5 s. We use the three-phase response time to avoid the issue of higher latency, and the first response time is around 0.15 s.
Citations: 0
A scheduler to foster data locality for GPU and out-of-core task-based linear algebra applications
IF 4, Zone 3, Computer Science
Journal of Parallel and Distributed Computing Pub Date: 2025-12-01 Epub Date: 2025-08-18 DOI: 10.1016/j.jpdc.2025.105170 Volume: 206 Article: 105170
Maxime Gonthier, Loris Marchal, Samuel Thibault
Abstract: Hardware accelerators like GPUs now provide a large part of the computational power used for scientific simulations. Despite their efficacy, GPUs possess limited memory and are connected to the main memory of the machine via a bandwidth-limited bus. Scientific simulations often operate on very large data that surpass the GPU's memory capacity. Therefore, one has to turn to out-of-core computing: data is kept in a remote, slower memory (CPU memory), and moved back and forth from/to the device memory (GPU memory), a process also present for multicore CPUs with limited memory. In both cases, data movement quickly becomes a performance bottleneck. Task-based runtime schedulers have emerged as a convenient and efficient way to manage large applications on such heterogeneous platforms. We propose a scheduler for task-based runtimes that improves data locality for out-of-core linear algebra computations, to reduce data movement. We design a data-aware strategy for both task scheduling and data eviction from limited memories. We compare this scheduler to existing schedulers in runtime systems. Using StarPU, we show that our new scheduling strategy achieves comparable performance when memory is not a constraint, and significantly better performance when application input data exceeds memory, on both GPUs and CPU cores.
Citations: 0
SEDViN: Secure embedding for dynamic virtual network requests using a multi-attribute matching game
IF 4, Zone 3, Computer Science
Journal of Parallel and Distributed Computing Pub Date: 2025-12-01 Epub Date: 2025-09-03 DOI: 10.1016/j.jpdc.2025.105171 Volume: 206 Article: 105171
T.G. Keerthan Kumar, Rahul Kumar, Anirudh Munnur Achal, Anurag Satpathy, Sourav Kanti Addya
Abstract: Network virtualization (NV) has gained significant attention as it allows service providers (SPs) to share substrate network (SN) resources. It is achieved by partitioning them into isolated virtual network requests (VNRs) comprising interrelated virtual machines (VMs) and virtual links (VLs). Although NV provides various advantages, such as service separation, enhanced quality-of-service, reliability, and improved SN utilization, it also presents multiple scientific challenges. In this context, one pivotal challenge encountered by researchers is secure virtual network embedding (SVNE). SVNE encompasses assigning SN resources to the components of a VNR, i.e., VMs and VLs, while adhering to the security demands, which is a computationally intractable problem, as it is proven to be NP-hard. Maximizing the acceptance and revenue-to-cost ratios remains of utmost priority for SPs, as it not only increases revenue but also effectively utilizes the large pool of SN resources. Though VNE is a well-researched problem, the existing literature has the following flaws: (i) security features of VMs and VLs are ignored, (ii) limited consideration of topological attributes, and (iii) restriction to static VNRs. SPs therefore need an embedding framework that overcomes the abovementioned pitfalls. This work proposes a framework, Secure Embedding for Dynamic Virtual Network requests using a multi-attribute matching game (SEDViN). In SEDViN, a matching game based on the deferred acceptance algorithm (DAA) is used for effective embedding. SEDViN operates primarily in two steps to obtain a secure embedding of dynamic VNRs. Firstly, it generates a unified ranking for VMs and servers using a combination of entropy and the technique for order of preference by similarity to the ideal solution (TOPSIS), considering network, security, and system attributes. Taking these as inputs, in the second step, VNR embedding is conducted using the deferred acceptance approach based on a one-to-many matching strategy for VM embedding, and VL embedding uses the shortest path algorithm. The performance of SEDViN is evaluated through simulations and compared against different baseline approaches. The simulation outcomes show that SEDViN surpasses the baselines with a gain of 56% in the acceptance ratio and 44% in the revenue-to-cost ratio.
Citations: 0
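The first step in the SEDViN abstract above, ranking by entropy combined with TOPSIS, follows a standard recipe that can be sketched briefly. The code below is a generic entropy-weighted TOPSIS, not the paper's implementation: the attribute set (CPU, bandwidth, security level, latency), the server values, and the function names are hypothetical, and it assumes strictly positive attribute values with at least two non-identical rows.

```python
import math


def entropy_weights(matrix):
    """Shannon-entropy weights: criteria that vary more across rows get more weight."""
    rows, cols = len(matrix), len(matrix[0])
    divergences = []
    for j in range(cols):
        total = sum(row[j] for row in matrix)
        shares = [row[j] / total for row in matrix]
        entropy = -sum(s * math.log(s) for s in shares if s > 0) / math.log(rows)
        divergences.append(1.0 - entropy)
    norm = sum(divergences)
    return [d / norm for d in divergences]


def topsis(matrix, weights, benefit):
    """Return a closeness score in [0, 1] per row; higher means closer to the ideal."""
    cols = len(weights)
    # Vector-normalise each column, then apply the criterion weight.
    lengths = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(cols)]
    scaled = [[weights[j] * row[j] / lengths[j] for j in range(cols)] for row in matrix]
    # Ideal takes the best value per criterion, worst takes the opposite.
    ideal = [(max if benefit[j] else min)(r[j] for r in scaled) for j in range(cols)]
    worst = [(min if benefit[j] else max)(r[j] for r in scaled) for j in range(cols)]
    scores = []
    for r in scaled:
        d_ideal = math.dist(r, ideal)
        d_worst = math.dist(r, worst)
        scores.append(d_worst / (d_ideal + d_worst))
    return scores


# Hypothetical servers: (CPU cores, bandwidth, security level, latency).
servers = [[64, 10, 3, 5], [32, 5, 1, 9], [48, 8, 2, 7]]
benefit = [True, True, True, False]  # latency is a cost criterion
weights = entropy_weights(servers)
scores = topsis(servers, weights, benefit)
ranking = sorted(range(len(servers)), key=lambda i: scores[i], reverse=True)
```

In this made-up example the first server is best on every criterion and the second is worst on every criterion, so they receive the extreme scores and the third lands in between. The resulting ranking is the kind of preference list that a deferred acceptance matching step could then consume.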
HeaPS: Heterogeneity-aware participant selection for efficient federated learning
IF 4, Zone 3, Computer Science
Journal of Parallel and Distributed Computing Pub Date: 2025-12-01 Epub Date: 2025-08-19 DOI: 10.1016/j.jpdc.2025.105168 Volume: 206 Article: 105168
Duo Yang, Bing Hu, Yunqi Gao, A-Long Jin, An Liu, Kwan L. Yeung, Yang You
Abstract: Federated learning enables collaborative model training among numerous clients. However, existing participant/client selection methods fail to fully leverage the advantages of clients with excellent computational or communication capabilities. In this paper, we propose HeaPS, a novel Heterogeneity-aware Participant Selection framework for efficient federated learning. We introduce a finer-grained global selection algorithm to select communication-strong leaders and computation-strong members from candidate clients. The leaders are responsible for communicating with the server to reduce per-round duration, as well as contributing gradients, while the members communicate with the leaders to contribute more gradients obtained from high-utility data to the global model and improve the final model accuracy. Meanwhile, we develop a gradient migration path generation algorithm to match the optimal leader for each member. We also design the client scheduler to facilitate parallel local training of leaders and members based on gradient migration. Experimental results show that, in comparison with state-of-the-art methods, HeaPS achieves a speedup of up to 3.20× in time-to-accuracy performance and improves the final accuracy by up to 3.57%. The code for HeaPS is available at https://github.com/Dora233/HeaPS.
Citations: 0