{"title":"BRFL: A blockchain-based byzantine-robust federated learning model","authors":"Yang Li , Chunhe Xia , Chang Li , Tianbo Wang","doi":"10.1016/j.jpdc.2024.104995","DOIUrl":"10.1016/j.jpdc.2024.104995","url":null,"abstract":"<div><div>With the increasing importance of machine learning, the privacy and security of training data have become a concern. Federated learning, which stores data in distributed nodes and shares only model parameters, has gained significant attention for addressing this concern. However, a challenge arises in federated learning due to the byzantine attack problem, where malicious local models can compromise the global model's performance during aggregation. This article proposes the <u>B</u>lockchain-based Byzantine-<u>R</u>obust <u>F</u>ederated <u>L</u>earning (BRFL) model, which combines federated learning with blockchain technology. We improve the robustness of federated learning by proposing a new consensus algorithm and aggregation algorithm for blockchain-based federated learning. Meanwhile, we modify the block saving rules of the blockchain to reduce the storage pressure of the nodes. Experimental results on public datasets demonstrate the superior byzantine robustness of our secure aggregation algorithm compared to other baseline aggregation methods, and reduce the storage pressure of the blockchain nodes.</div></div>","PeriodicalId":54775,"journal":{"name":"Journal of Parallel and Distributed Computing","volume":null,"pages":null},"PeriodicalIF":3.4,"publicationDate":"2024-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142445427","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A lightweight RDMA connection protocol based on post-hoc confirmation","authors":"Ke Wu, Dezun Dong, Weixia Xu","doi":"10.1016/j.jpdc.2024.104991","DOIUrl":"10.1016/j.jpdc.2024.104991","url":null,"abstract":"<div><div>With the increasing scale and complexity of high-performance computing systems, the rising failure rate poses significant challenges for RDMA networks that aim for high bandwidth and low latency. RDMA networks require hardware-level end-to-end reliable data transmission services to avoid the high cost of software failure recovery. Tianhe HPC interconnection network adopts a NIC-based RDMA reliable connection protocol, RCP. RCP establishes a connection for each message that enters the NIC and releases it after the transmission is complete. However, this introduces an additional round-trip time RTT connection overhead for each message, which severely impacts the performance of networks dominated by short messages in high-performance computing systems. We have found that utilization of receiver-side connection resources has been consistently low because maintaining message-grained connections on the NIC results in rapid release of connections. Therefore, we propose a lightweight RDMA connection protocol based on post-hoc confirmation, PCP. PCP assumes the receiver has connection resources by default and eliminates the need for confirmation from the receiver before sending a message, thus reducing the connection overhead of almost all messages by one RTT. At the same time, PCP also includes mechanisms to address the special case where the receiver lacks connection resources. Evaluation results demonstrate that PCP significantly optimizes short messages and applications dominated by short messages. Moreover, PCP further reduces the usage of receiver-side connection resources. Additionally, PCP does not experience performance degradation even under large-scale heavy loads and severe endpoint congestion.</div></div>","PeriodicalId":54775,"journal":{"name":"Journal of Parallel and Distributed Computing","volume":null,"pages":null},"PeriodicalIF":3.4,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142427635","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SpEpistasis: A sparse approach for three-way epistasis detection","authors":"Diogo Marques, Leonel Sousa, Aleksandar Ilic","doi":"10.1016/j.jpdc.2024.104989","DOIUrl":"10.1016/j.jpdc.2024.104989","url":null,"abstract":"<div><div>Epistasis detection is a fundamental application in the areas of bioinformatics and biomedicine, providing important insights regarding the relationship between the human genome and the occurrence of certain diseases. Exhaustive epistasis detection approaches are employed to achieve an accurate and deterministic solution, at the cost of high computational complexity, especially when targeting high-order epistasis. While recent works employ vectorization and cache-blocking techniques to alleviate this burden, these solutions are now limited by the maximum performance of the functional units of computing systems. Thus, to further improve the performance of epistasis detection it is necessary to reduce its number of memory transfers and computations. To tackle this issue, this work proposes SpEpistasis, which performs three-way epistasis detection by relying on sparse features, which by only storing the non-zero elements of the dataset, allows for reducing the number of operations needed for epistasis detection. To achieve this goal, a new hybrid format to represent the input dataset is proposed, which stores a subset of the data in the compressed sparse row format. Moreover, new sparse-aware algorithmic approaches are also proposed in order to leverage both the hybrid format and the vector capabilities of current CPUs from Intel, AMD, and ARM. The experimental results show that SpEpistasis provides a speedup up to 3.7× and average speedups of around 1.8× and 1.33× when compared with other state-of-the-art works.</div></div>","PeriodicalId":54775,"journal":{"name":"Journal of Parallel and Distributed Computing","volume":null,"pages":null},"PeriodicalIF":3.4,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142323939","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust and Scalable Federated Learning Framework for Client Data Heterogeneity Based on Optimal Clustering","authors":"Zihan Li , Shuai Yuan , Zhitao Guan","doi":"10.1016/j.jpdc.2024.104990","DOIUrl":"10.1016/j.jpdc.2024.104990","url":null,"abstract":"<div><div>Federated learning is a promising paradigm for applications across a variety of domains. However, there are some challenges that must be addressed in real-world scenarios, particularly the data heterogeneity among participating clients. Most existing studies primarily focus on the issue of non-independent and identically distributed data, but they do not consider the critical aspect of data quality heterogeneity. When low-quality data is contributed by some clients, the efficacy of models trained through the traditional approaches will be significantly compromised. Therefore, we propose ROSCFL, a robust and scalable federated learning framework for client data heterogeneity based on optimal clustering. We first develop a cluster contribution evaluation strategy based on the optimal clustering to quantify the contribution of each cluster. Next, we design a robust model aggregation strategy, which effectively mitigates the impact of low-quality data on the global model by optimizing weight allocation and client sampling. Finally, we introduce a client incorporation mechanism to enhance the scalability of ROSCFL. Extensive experiments have been conducted, and the results demonstrate that ROSCFL achieves strong robustness and scalability, particularly in scenarios wherein data distribution and quality heterogeneity coexist.</div></div>","PeriodicalId":54775,"journal":{"name":"Journal of Parallel and Distributed Computing","volume":null,"pages":null},"PeriodicalIF":3.4,"publicationDate":"2024-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142327797","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Front Matter 1 - Full Title Page (regular issues)/Special Issue Title page (special issues)","authors":"","doi":"10.1016/S0743-7315(24)00146-1","DOIUrl":"10.1016/S0743-7315(24)00146-1","url":null,"abstract":"","PeriodicalId":54775,"journal":{"name":"Journal of Parallel and Distributed Computing","volume":null,"pages":null},"PeriodicalIF":3.4,"publicationDate":"2024-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0743731524001461/pdfft?md5=4b65d789bc9db964e4fbb6b24c70b8aa&pid=1-s2.0-S0743731524001461-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142274618","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Survey of federated learning in intrusion detection","authors":"Hao Zhang , Junwei Ye , Wei Huang , Ximeng Liu , Jason Gu","doi":"10.1016/j.jpdc.2024.104976","DOIUrl":"10.1016/j.jpdc.2024.104976","url":null,"abstract":"<div><p>Intrusion detection methods are crucial means to mitigate network security issues. However, the challenges posed by large-scale complex network environments include local information islands, regional privacy leaks, communication burdens, difficulties in handling heterogeneous data, and storage resource bottlenecks. Federated learning has the potential to address these challenges by leveraging widely distributed and heterogeneous data, achieving load balancing of storage and computing resources across multiple nodes, and reducing the risks of privacy leaks and bandwidth resource demands. This paper reviews the process of constructing federated learning based intrusion detection system from the perspective of intrusion detection. Specifically, it outlines six main aspects: application scenario analysis, federated learning methods, privacy and security protection, selection of classification models, data sources and client data distribution, and evaluation metrics, establishing them as key research content. Subsequently, six research topics are extracted based on these aspects. These topics include expanding application scenarios, enhancing aggregation algorithm, enhancing security, enhancing classification models, personalizing model and utilizing unlabeled data. Furthermore, the paper delves into research content related to each of these topics through in-depth investigation and analysis. Finally, the paper discusses the current challenges faced by research, and suggests promising directions for future exploration.</p></div>","PeriodicalId":54775,"journal":{"name":"Journal of Parallel and Distributed Computing","volume":null,"pages":null},"PeriodicalIF":3.4,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142271035","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The analysis of P2P networks with malicious peers and repairable breakdown based on Geo/Geo/1+1 queue","authors":"Ying Shen, Zhanyou Ma","doi":"10.1016/j.jpdc.2024.104979","DOIUrl":"10.1016/j.jpdc.2024.104979","url":null,"abstract":"<div><p>The incredible growth of Peer-to-Peer (P2P) networks has brought with it some complex challenges, such as trust issues and high bandwidth consumption. To address these challenges, this paper analyzes the “free-riding” behavior, system energy consumption, and the benefits of requesting and service peers in the network. A Geo/Geo/1+1 queuing model is built with malicious peers which includes several strategies such as repairable breakdown, synchronized multiple working vacations, differentiated service, and waiting threshold. The matrix-geometric solution method is used to obtain steady-state distribution and performance measures. By conducting numerical experiments and analyzing the impact of each parameter, it is possible to optimize the system's performance and reduce energy consumption. With careful adjustments to parameter values, significant cost savings of requesting peers and energy conservation can be achieved. The resulting analysis provides a comprehensive understanding of the behavior of P2P networks, and the strategies proposed in the study can be used to optimize the performance of P2P networks.</p></div>","PeriodicalId":54775,"journal":{"name":"Journal of Parallel and Distributed Computing","volume":null,"pages":null},"PeriodicalIF":3.4,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142242365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"B2DFL: Bringing butterfly to decentralized federated learning assisted with blockchain","authors":"Hao Wang , Yichen Cai , Yu Tao , Luyao Wang , Yanbin Li , Lu Zhou","doi":"10.1016/j.jpdc.2024.104978","DOIUrl":"10.1016/j.jpdc.2024.104978","url":null,"abstract":"<div><p>We propose a novel decentralized federated learning framework called B2DFL. It decomposes the aggregation process of vanilla FL into layered and serialized sub-aggregation processes and offloads the communication and computation from a single point to distributed nodes, thus addressing the single point of failure issue in centralized FL. The decentralization of B2DFL is based on the Butterfly, a distributed network topology, to organize and orchestrate the order and rules of node aggregation. Additionally, to mitigate potential risks such as dropouts or tampering, we leverage the blockchain and IPFS systems. Specifically, after each node completes its computation (including training and aggregation), it generates a hash value of the results as proof. We maintain a Tamper-evident Data Structure (TDS) on the blockchain, which records these proofs to ensure tamper-proofing and fast verification. To reduce the storage burden on the blockchain and improve throughput, we store the aggregated results on IPFS, a system that enables quick data location through hash values of data, for data backup. We also design a node replacement mechanism for quick dropout handling. We conduct a comprehensive performance evaluation and experimental results demonstrate that B2DFL presents a significant performance improvement while achieving privacy and decentralization.</p></div>","PeriodicalId":54775,"journal":{"name":"Journal of Parallel and Distributed Computing","volume":null,"pages":null},"PeriodicalIF":3.4,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142242362","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Accelerating Fortran codes: A method for integrating Coarray Fortran with CUDA Fortran and OpenMP","authors":"James McKevitt , Eduard I. Vorobyov , Igor Kulikov","doi":"10.1016/j.jpdc.2024.104977","DOIUrl":"10.1016/j.jpdc.2024.104977","url":null,"abstract":"<div><p>Fortran's prominence in scientific computing requires strategies to ensure both that legacy codes are efficient on high-performance computing systems, and that the language remains attractive for the development of new high-performance codes. Coarray Fortran (CAF), part of the Fortran 2008 standard introduced for parallel programming, facilitates distributed memory parallelism with a syntax familiar to Fortran programmers, simplifying the transition from single-processor to multi-processor coding. This research focuses on innovating and refining a parallel programming methodology that fuses the strengths of Intel Coarray Fortran, Nvidia CUDA Fortran, and OpenMP for distributed memory parallelism, high-speed GPU acceleration and shared memory parallelism respectively. We consider the management of pageable and pinned memory, CPU-GPU affinity in NUMA multiprocessors, and robust compiler interfacing with speed optimisation. We demonstrate our method through its application to a parallelised Poisson solver and compare the methodology, implementation, and scaling performance to that of the Message Passing Interface (MPI), finding CAF offers similar speeds with easier implementation. For new codes, this approach offers a faster route to optimised parallel computing. For legacy codes, it eases the transition to parallel computing, allowing their transformation into scalable, high-performance computing applications without the need for extensive re-design or additional syntax.</p></div>","PeriodicalId":54775,"journal":{"name":"Journal of Parallel and Distributed Computing","volume":null,"pages":null},"PeriodicalIF":3.4,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0743731524001412/pdfft?md5=69e1ea2ba9c62d46ed1506e701029846&pid=1-s2.0-S0743731524001412-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142172595","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Front Matter 1 - Full Title Page (regular issues)/Special Issue Title page (special issues)","authors":"","doi":"10.1016/S0743-7315(24)00136-9","DOIUrl":"10.1016/S0743-7315(24)00136-9","url":null,"abstract":"","PeriodicalId":54775,"journal":{"name":"Journal of Parallel and Distributed Computing","volume":null,"pages":null},"PeriodicalIF":3.4,"publicationDate":"2024-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0743731524001369/pdfft?md5=dfe2623c0180f0c77ae8f5870a3416cc&pid=1-s2.0-S0743731524001369-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142048051","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}