IEEE Transactions on Parallel and Distributed Systems: Latest Articles

A Comparative Study of Sampling Methods With Cross-Validation in the FedHome Framework
IF 5.6 · CAS Tier 2 (Computer Science)
IEEE Transactions on Parallel and Distributed Systems. Pub Date: 2025-01-15. DOI: 10.1109/TPDS.2025.3526238
Arash Ahmadi;Sarah S. Sharif;Yaser M. Banad
Abstract: This article presents a comparative study of sampling methods within the FedHome framework, designed for personalized in-home health monitoring. FedHome leverages federated learning (FL) and generative convolutional autoencoders (GCAE) to train models on decentralized edge devices while prioritizing data privacy. A notable challenge in this domain is the class imbalance in health data, where critical events such as falls are underrepresented, adversely affecting model performance. To address this, the research evaluates six oversampling techniques using stratified K-fold cross-validation: SMOTE, Borderline-SMOTE, Random OverSampler, SMOTE-Tomek, SVM-SMOTE, and SMOTE-ENN. These methods are tested on FedHome's public implementation over 200 training rounds, with and without stratified K-fold cross-validation. The findings indicate that SMOTE-ENN achieves the most consistent test accuracy, with a standard deviation range of 0.0167-0.0176, demonstrating stable performance compared to the other samplers. In contrast, SMOTE and SVM-SMOTE exhibit higher variability, as reflected by their wider standard deviation ranges of 0.0157-0.0180 and 0.0155-0.0180, respectively. Similarly, the Random OverSampler method shows a significant deviation range of 0.0155-0.0176. SMOTE-Tomek, with a deviation range of 0.0160-0.0175, also shows greater stability, though not as much as SMOTE-ENN. This finding highlights the potential of SMOTE-ENN to enhance the reliability and accuracy of personalized health monitoring systems within the FedHome framework.
IEEE Transactions on Parallel and Distributed Systems, Vol. 36, No. 3, pp. 570-579.
Citations: 0
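The oversamplers compared above synthesize new minority-class points by interpolating between a minority sample and one of its nearest minority neighbors, and stability is then measured across stratified folds. Below is a minimal pure-Python sketch of both ideas — not the paper's FedHome pipeline; `smote_like` and `stratified_kfold` are illustrative names, and a real experiment would use imbalanced-learn's `SMOTEENN` and scikit-learn's `StratifiedKFold`:

```python
import random

def smote_like(minority, k=2, n_new=4, rng=None):
    """Naive SMOTE-style oversampling: each synthetic point is interpolated
    between a random minority sample and one of its k nearest minority
    neighbors (Euclidean distance)."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        a = rng.choice(minority)
        neighbors = sorted((m for m in minority if m != a),
                           key=lambda m: sum((x - y) ** 2 for x, y in zip(a, m)))[:k]
        b = rng.choice(neighbors)
        t = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(x + t * (y - x) for x, y in zip(a, b)))
    return synthetic

def stratified_kfold(labels, k=3):
    """Yield k folds of sample indices, preserving per-class proportions."""
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        for j, i in enumerate(idxs):
            folds[j % k].append(i)
    return folds
```

Because interpolation stays on the segment between two real samples, synthetic points remain inside the minority class's convex hull, which is the core geometric property SMOTE variants share.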
GPABE: GPU-Based Parallelization Framework for Attribute-Based Encryption Schemes
IEEE Transactions on Parallel and Distributed Systems. Pub Date: 2025-01-13. DOI: 10.1109/TPDS.2025.3529776
Wenhan Xu;Hui Ma;Rui Zhang;Jianhao Li
Abstract: Attribute-based encryption (ABE) has emerged as a new paradigm for access control in cloud computing. However, despite its many promising features, ABE's deployment in real-world systems is still limited, partially due to the expensive cost of its underlying mathematical operations, which often grows linearly with the size and complexity of the system's security policies. This becomes particularly challenging in data-intensive applications, where multiple users may simultaneously access and manipulate large volumes of data, resulting in levels of concurrency and demand for computing resources that are too heavy even for high-end servers. Further exacerbating the issue are the functionality and security requirements of a cloud, as they introduce additional computations on both the client and the server. Therefore, in this work, we introduce $\mathsf{GPABE}$, the first GPU-based parallelization framework for ABE, to facilitate its batch processing in cloud computing. By analyzing ABE's major computational workload, we identify multiple arithmetic modules that are common in the design of pairing-based ABEs. Based on this analysis, we further propose to decompose an ABE algorithm into a computation graph, which can be efficiently implemented on the GPU platform. Our graph representation bridges the gap between ABE schemes' high-level design and their low-level implementation on GPUs, and is applicable to a variety of popular schemes in the realm of ABE. We then implement $\mathsf{GPABE}$ as a heterogeneous computing server, with several optimization techniques to improve its throughput. Finally, we evaluate the GPU implementation of several ABE schemes using $\mathsf{GPABE}$. The results show a speedup of at least 51.0× and at most 253.6× in the throughput of ABE algorithms, compared to their state-of-the-art CPU implementations, which preliminarily demonstrates the effectiveness of $\mathsf{GPABE}$.
IEEE Transactions on Parallel and Distributed Systems, Vol. 36, No. 3, pp. 520-536.
Citations: 0
Chasing Common Knowledge: Joint Large Model Selection and Pulling in MEC With Parameter Sharing
IEEE Transactions on Parallel and Distributed Systems. Pub Date: 2025-01-08. DOI: 10.1109/TPDS.2025.3527649
Lizhen Zhou;Zichuan Xu;Qiufen Xia;Zhou Xu;Wenhao Ren;Wenbo Qi;Jinjing Ma;Song Yan;Yuan Yang
Abstract: Pretrained Foundation Models (PFMs) are regarded as a promising accelerator for the development of various Artificial Intelligence (AI) applications, and have recently been widely fine-tuned to satisfy users' personalized inference demands. As many users are attracted to PFM-based AI applications, remote data centers are increasingly unable to bear the enormous computational demands alone and meet the delay requirements of inference requests. Mobile edge computing (MEC) offers a viable solution for delivering low-latency inference services by pulling fine-tuned PFMs from the remote data center to cloudlets in the proximity of users. However, a fine-tuned PFM typically comprises billions of model parameters, which are highly resource-intensive, time-consuming, and cost-prohibitive to execute at the edge. To address this, we investigate a novel joint large model selection and pulling problem in MEC networks. The novelty of our study lies in exploring parameter sharing among fine-tuned PFMs based on their common knowledge. Specifically, we first formulate a Non-Linear Integer Program (NLIP) for the problem to minimize the total delay of serving all inference requests. We then transform the NLIP into an equivalent Integer Linear Program (ILP) that is much simpler to solve. We further propose a randomized algorithm with a provable approximation ratio for the problem. We also consider the online version of the problem with uncertain request demand, and develop an online learning algorithm with a bounded regret. The crux of the online algorithm is the adoption of the multi-armed bandit technique with restricted context for dynamic admission of inference requests. We finally conduct extensive experiments based on real datasets. Experimental results demonstrate that our algorithms reduce total delays and average costs by at least 38%, while achieving a 5% improvement in average accuracy.
IEEE Transactions on Parallel and Distributed Systems, Vol. 36, No. 3, pp. 437-454.
Citations: 0
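The online algorithm above treats candidate models as bandit arms and learns which to pull from observed rewards. As a rough illustration of that idea only — the paper uses a restricted-context bandit with regret bounds, whereas this toy uses plain epsilon-greedy, and `EpsilonGreedyBandit` is a hypothetical name — a minimal sketch:

```python
import random

class EpsilonGreedyBandit:
    """Toy epsilon-greedy bandit: each arm is a candidate model to pull and
    serve; the reward could be, e.g., the negative observed inference delay."""
    def __init__(self, n_arms, epsilon=0.1, seed=0):
        self.counts = [0] * n_arms       # times each arm was played
        self.values = [0.0] * n_arms     # running mean reward per arm
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def select(self):
        # explore with probability epsilon, otherwise exploit the best mean
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        return max(range(len(self.counts)), key=lambda a: self.values[a])

    def update(self, arm, reward):
        self.counts[arm] += 1
        # incremental update of the arm's mean reward
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]
```

Run over many rounds, the arm (model) with the higher mean reward is selected increasingly often, which is the behavior the regret bound formalizes.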
2024 Reviewers List*
IEEE Transactions on Parallel and Distributed Systems. Pub Date: 2025-01-08. DOI: 10.1109/TPDS.2024.3512712
IEEE Transactions on Parallel and Distributed Systems, Vol. 36, No. 2, pp. 356-360. Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10834303
Citations: 0
Towards Efficient Verifiable Cloud Storage and Distribution for Large-Scale Data Streaming
IEEE Transactions on Parallel and Distributed Systems. Pub Date: 2025-01-07. DOI: 10.1109/TPDS.2025.3526642
Haining Yang;Dengguo Feng;Jing Qin
Abstract: A data stream is an ordered sequence of data continuously generated over time, whose dynamic scale is hard to predict in advance. Since traditional integrity-verification primitives cannot check the integrity of the retrieved data and the outsourced database in the streaming setting, specific schemes have been proposed that adopt a tree-like authentication structure or a combination of signatures and accumulators. However, these schemes are not optimal for the data owner. The main concerns are how to reduce the size of the authentication information to be smaller than the scale of the data stream, and how to enable a resource-constrained owner to check data integrity without issuing challenges. To address these problems, we design a scheme around a novel technique called decentralized vector commitment (DVC). Toward this goal, we first propose a key-exposure-free chameleon vector commitment scheme, and then present an efficient DVC technique based on it; the final scheme is constructed by leveraging this DVC technique. Besides integrity verification, our scheme can also efficiently distribute the data to a user who is protected from receiving stale data. To optimize performance when concurrently retrieving multiple data items, we introduce a batch query that reduces a large amount of communication and computation overhead. The security analysis and performance evaluation show that our solutions are secure and efficient.
IEEE Transactions on Parallel and Distributed Systems, Vol. 36, No. 3, pp. 487-501.
Citations: 0
HTLL: Latency-Aware Scalable Blocking Mutex
IEEE Transactions on Parallel and Distributed Systems. Pub Date: 2025-01-07. DOI: 10.1109/TPDS.2025.3526859
Ziqu Yu;Jinyu Gu;Zijian Wu;Nian Liu;Jian Guo
Abstract: This article finds that existing mutex locks suffer from throughput collapse, latency collapse, or both in oversubscribed scenarios where applications create more threads than there are CPU cores, e.g., database applications like MySQL that use one thread per connection. We conduct an in-depth performance analysis of existing locks and identify three design rules for a lock primitive to achieve scalable performance in oversubscribed scenarios. First, to achieve ideal throughput, the lock design should keep an adequate number of active competitors. Second, the active competitors should be arranged carefully to avoid the lock-holder preemption problem. Third, to meet latency requirements, the lock design should track the latency of each competitor and reorder competitors according to their latency requirements. We propose a new lock library called HTLL that satisfies these rules and achieves both high throughput and low latency even when the cores are oversubscribed. HTLL requires only minimal human effort (e.g., adding several lines of code) to annotate the latency requirement. Evaluation results show that HTLL achieves scalable performance in oversubscribed scenarios. Specifically, for the real-world database LMDB, HTLL reduces tail latency by up to 97% with only a 5% average degradation in throughput, compared with state-of-the-art alternatives such as the Malthusian, CST, and Mutexee locks; in comparison to the widely used pthread_mutex_lock, it increases throughput by up to 22% and decreases latency by up to 80%. Meanwhile, in under-subscribed scenarios it shows performance comparable to state-of-the-art blocking locks.
IEEE Transactions on Parallel and Distributed Systems, Vol. 36, No. 3, pp. 471-486.
Citations: 0
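The third design rule above — track each competitor's latency requirement and reorder the wait queue accordingly — can be illustrated with a deadline-ordered waiter queue. This is a single-threaded sketch of the ordering policy only, not HTLL's actual C implementation; `DeadlineWaitQueue` is a hypothetical name:

```python
import heapq
import itertools

class DeadlineWaitQueue:
    """Sketch of latency-aware waiter ordering: lock competitors are handed
    the lock in order of their latency requirement (deadline) rather than
    pure FIFO arrival order, so latency-critical waiters go first."""
    def __init__(self):
        self._heap = []
        self._tie = itertools.count()  # preserves FIFO among equal deadlines

    def arrive(self, waiter_id, deadline):
        heapq.heappush(self._heap, (deadline, next(self._tie), waiter_id))

    def next_holder(self):
        """Pop the waiter that should acquire the lock next."""
        deadline, _, waiter_id = heapq.heappop(self._heap)
        return waiter_id
```

A real lock must also bound how far a tight-deadline waiter can jump the queue to avoid starving the others, which is part of what makes the full design nontrivial.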
SMDP-Based Dynamic Batching for Improving Responsiveness and Energy Efficiency of Batch Services
IEEE Transactions on Parallel and Distributed Systems. Pub Date: 2025-01-06. DOI: 10.1109/TPDS.2025.3526283
Yaodan Xu;Sheng Zhou;Zhisheng Niu
Abstract: For servers incorporating parallel computing resources, batching is a pivotal technique for providing efficient and economical services at scale. Parallel computing resources exhibit higher computational and energy efficiency when operating with larger batch sizes. However, in online services, adopting a larger batch size may lead to longer response times. This paper provides a dynamic batching scheme that delicately balances latency and efficiency. The system is modeled as a batch-service queue with size-dependent service times. The design of dynamic batching is then formulated as a semi-Markov decision process (SMDP) problem, with the objective of minimizing the weighted sum of average response time and average power consumption. A method is proposed to derive an approximately optimal SMDP solution, representing the chosen dynamic batching policy. By introducing an abstract cost to reflect the impact of "tail" states, the space complexity and time complexity of the procedure decrease by 63.5% and 98%, respectively. Numerical results showcase the superiority of SMDP-based batching policies across various parameter setups. Additionally, the proposed scheme exhibits noteworthy flexibility in balancing power consumption and latency.
IEEE Transactions on Parallel and Distributed Systems, Vol. 36, No. 4, pp. 659-674.
Citations: 0
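The tradeoff the SMDP formulation optimizes — larger batches amortize setup cost but make requests wait — can be seen in a toy model with a size-dependent service time. This is only an enumeration sketch of the weighted latency/efficiency objective, not the paper's SMDP solution; all function names and the linear service-time model are illustrative assumptions:

```python
def service_time(batch_size, setup=1.0, per_item=0.1):
    """Size-dependent batch service time: fixed setup plus per-item cost."""
    return setup + per_item * batch_size

def per_item_efficiency(batch_size):
    """Service time amortized per request; drops as the batch grows."""
    return service_time(batch_size) / batch_size

def weighted_objective(batch_size, arrival_gap, w_latency=1.0, w_power=1.0):
    """Weighted sum of mean response time and per-item (energy-like) cost."""
    wait = arrival_gap * (batch_size - 1) / 2  # mean wait to fill the batch
    return (w_latency * (wait + service_time(batch_size))
            + w_power * per_item_efficiency(batch_size))

def best_batch(arrival_gap, max_b=32, **weights):
    """Pick the batch size minimizing the weighted objective by enumeration."""
    return min(range(1, max_b + 1),
               key=lambda b: weighted_objective(b, arrival_gap, **weights))
```

With sparse arrivals the wait term dominates and small batches win; with dense arrivals the amortization term dominates and larger batches win, which is exactly the regime-dependent behavior a dynamic policy exploits.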
Paralfetch: Fast Application Launch on Personal Computing/Communication Devices
IEEE Transactions on Parallel and Distributed Systems. Pub Date: 2025-01-02. DOI: 10.1109/TPDS.2024.3525337
Junhee Ryu;Dongeun Lee;Kang G. Shin;Kyungtae Kang
Abstract: Paralfetch speeds up application launches on personal computing/communication devices by means of: 1) accurate collection of launch-related disk read requests; 2) pre-scheduling of these requests to improve I/O throughput during prefetching; and 3) overlapping application execution with disk prefetching to hide disk access time from the application's execution. We implemented Paralfetch under Linux kernels on a desktop/laptop PC, a Raspberry Pi 3 board, and an Android smartphone. Tests with popular applications show that Paralfetch significantly reduces application launch times on flash-based drives and hard disk drives, and that it outperforms GSoC Prefetch (Lichota et al., 2007) and FAST (Joo et al., 2011), which are representative application prefetchers available for Linux-based systems.
IEEE Transactions on Parallel and Distributed Systems, Vol. 36, No. 4, pp. 616-632.
Citations: 0
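Step 3 above — overlapping execution with prefetching — amounts to issuing all launch-related reads up front on a background worker and letting execution consume each block as it becomes ready. A minimal sketch of that overlap pattern (not Paralfetch's kernel-level implementation; `launch_with_prefetch` and its callables are illustrative):

```python
from concurrent.futures import ThreadPoolExecutor

def launch_with_prefetch(blocks, read_block, execute):
    """Issue reads for all launch-related blocks up front on a background
    thread, then let execution consume each block as it becomes available,
    hiding I/O latency behind computation."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        # pre-scheduled reads: all I/O is queued before execution starts
        futures = [pool.submit(read_block, b) for b in blocks]
        # execution overlaps with the remaining in-flight reads
        return [execute(f.result()) for f in futures]
```

While the main thread executes on block i, the worker is already reading block i+1, so total launch time approaches max(I/O time, compute time) instead of their sum.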
HpT: Hybrid Acceleration of Spatio-Temporal Attention Model Training on Heterogeneous Manycore Architectures
IEEE Transactions on Parallel and Distributed Systems. Pub Date: 2025-01-01. DOI: 10.1109/TPDS.2024.3522781
Saiman Dahal;Pratyush Dhingra;Krishu Kumar Thapa;Partha Pratim Pande;Ananth Kalyanaraman
Abstract: Transformer models have become widely popular in numerous applications, especially for building foundation large language models (LLMs). Recently, there has been a surge in the exploration of transformer-based architectures in non-LLM applications. In particular, the self-attention mechanism within the transformer architecture offers a way to exploit hidden relations within data, making it widely applicable to a variety of spatio-temporal tasks in scientific computing domains (e.g., weather, traffic, agriculture). Most of these efforts have primarily focused on accelerating the inference phase. However, the computational resources required to train these attention-based models for scientific applications remain a significant challenge. Emerging non-volatile memory (NVM)-based processing-in-memory (PIM) architectures can achieve higher performance and better energy efficiency than their GPU-based counterparts. However, the frequent weight updates during training would necessitate write operations to NVM cells, posing a significant barrier to stand-alone NVM-based PIM architectures. In this paper, we present HpT, a new hybrid approach to accelerate the training of attention-based models for scientific applications. Our approach is hybrid at two layers: at the software layer, it dynamically switches from a full-parameter training mode to a lower-parameter training mode by incorporating intrinsic dimensionality; at the hardware layer, it harnesses the combined power of GPUs, resistive random-access memory (ReRAM)-based PIM devices, and systolic arrays. This software-hardware co-design approach adaptively reduces both runtime and energy costs during the training phase without compromising quality. Experiments on four real-world scientific applications demonstrate that our hybrid approach significantly reduces training time (up to 11.9×) and energy consumption (up to 12.05×) compared to full-parameter training executing only on GPUs. Our approach serves as an example of accelerating the training of attention-based models on heterogeneous platforms including ReRAMs.
IEEE Transactions on Parallel and Distributed Systems, Vol. 36, No. 3, pp. 407-421.
Citations: 0
Sparrow: Expediting Smart Contract Execution for Blockchain Sharding via Inter-Shard Caching
IEEE Transactions on Parallel and Distributed Systems. Pub Date: 2024-12-26. DOI: 10.1109/TPDS.2024.3522016
Junyuan Liang;Peiyuan Yao;Wuhui Chen;Zicong Hong;Jianting Zhang;Ting Cai;Min Sun;Zibin Zheng
Abstract: Sharding is a promising solution for scaling blockchains by separating the system into multiple shards that process transactions in parallel. However, due to state separation and shard isolation, it is still challenging to efficiently support smart contracts on a blockchain sharding system, where smart contracts can interact with each other and involve states maintained by multiple shards. Specifically, existing sharding systems adopt a costly multi-step collaboration mechanism to execute smart contracts, resulting in long latency and low throughput. This article proposes Sparrow, a blockchain sharding protocol achieving one-step execution for smart contracts. To break shard isolation, inspired by non-local hotspot data caching in traditional databases, we propose a new idea of inter-shard caching, allowing a shard to prefetch and cache frequently accessed contract states of other shards. A miner can thus use the inter-shard cache to pre-execute a pending transaction, retrieve all its contract invocations, and commit it to multiple shards in one step. In particular, we first propose a speculative dispersal cache-synchronization mechanism for efficient and secure cache synchronization across shards in Byzantine environments. We then propose a multi-branch exploration mechanism to solve the rollback problem during the optimistic one-step execution of contract invocations with dependencies. We also present a series of conflict-resolution mechanisms to reduce rollbacks caused by inherent transaction conflicts. We implement prototypes of Sparrow and existing sharding systems, and the evaluation shows that Sparrow improves throughput by 2.44× and reduces transaction latency by 30% compared with existing sharding systems.
IEEE Transactions on Parallel and Distributed Systems, Vol. 36, No. 3, pp. 377-390.
Citations: 0
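The inter-shard caching idea above mirrors hotspot caching in databases: foreign-shard state that is read often enough gets cached locally, so later transactions touching it can be pre-executed without a cross-shard round trip. A toy sketch of that hit/miss accounting — not Sparrow's Byzantine-tolerant protocol; `InterShardCache` and `fetch_remote` are illustrative names:

```python
class InterShardCache:
    """Toy inter-shard cache: a shard counts accesses to foreign-shard keys
    and caches only the hot ones, so repeated reads avoid the multi-step
    cross-shard collaboration (modeled here as fetch_remote round trips)."""
    def __init__(self, fetch_remote, hot_threshold=2):
        self.fetch_remote = fetch_remote   # callable: key -> value, cross-shard
        self.cache = {}
        self.access_count = {}
        self.hot_threshold = hot_threshold
        self.remote_round_trips = 0        # cost metric: cross-shard fetches

    def read(self, key):
        self.access_count[key] = self.access_count.get(key, 0) + 1
        if key in self.cache:              # hit: one-step local pre-execution
            return self.cache[key]
        self.remote_round_trips += 1       # miss: pay a cross-shard round trip
        value = self.fetch_remote(key)
        if self.access_count[key] >= self.hot_threshold:
            self.cache[key] = value        # promote hot keys into the cache
        return value
```

The hard part the paper addresses, which this sketch omits, is keeping such caches consistent across shards under Byzantine faults and rolling back speculative executions that read stale entries.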