Journal of Systems Architecture: Latest Articles

Privacy-preserving multidimensional data aggregation for diverse electricity data users
IF 3.7, Zone 2, Computer Science
Journal of Systems Architecture, Volume 161, Article 103363. Pub Date: 2025-02-16. DOI: 10.1016/j.sysarc.2025.103363
Huadong Liu, Yuanxing Peng, Yining Liu, Zhixin Zeng
Abstract: The smart grid enables the bidirectional flow of electricity and data, but during multidimensional electricity consumption data reporting, Smart Meters (SMs) may compromise users’ privacy by disclosing detailed electricity consumption data from various devices. Additionally, in addressing the needs of diverse electricity Data Users (DUs), it is essential to protect their legal rights. Consequently, it is necessary to implement dimension-level access control for multidimensional aggregated data. However, existing Multidimensional Data Aggregation (MDA) schemes often fail to provide efficient dimension-level access control while safeguarding users’ privacy. To address these problems, this paper establishes a smart grid model based on fog computing and introduces a privacy-preserving MDA scheme with dimension-level access control. Specifically, our scheme utilizes the threshold Paillier cryptosystem and the Chinese Remainder Theorem (CRT) to securely aggregate users’ multidimensional data. Meanwhile, our scheme employs digital signatures to ensure data integrity and implements Key-Policy Attribute-Based Encryption (KP-ABE) to enforce dimension-level access control. Comprehensive theoretical analysis indicates that our scheme satisfies privacy, integrity, and authentication. Extensive experimental results demonstrate that our scheme achieves a trade-off between dimension-level access control and system overhead.
Citations: 0
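The abstract above names the Chinese Remainder Theorem as one ingredient of multidimensional aggregation. The sketch below illustrates only the general CRT packing idea, not the paper's construction: each meter packs one value per dimension into a single integer, the packed integers are added, and the per-dimension totals are read back by reducing modulo each modulus. The moduli and readings are hypothetical, and encryption is omitted entirely; in a Paillier-based scheme the addition would happen on ciphertexts, which Paillier's additive homomorphism permits.

```python
from math import prod

def crt_pack(values, moduli):
    """Pack one value per dimension into X with X ≡ values[i] (mod moduli[i])."""
    M = prod(moduli)
    x = 0
    for v, m in zip(values, moduli):
        Mi = M // m
        # pow(Mi, -1, m) is the modular inverse of Mi mod m (Python 3.8+).
        x += v * Mi * pow(Mi, -1, m)
    return x % M

def crt_unpack(x, moduli):
    """Recover the per-dimension residues from a packed integer."""
    return [x % m for m in moduli]

# Hypothetical pairwise-coprime moduli; each must exceed the largest
# possible per-dimension total, or the totals wrap around.
MODULI = [1009, 1013, 1019]

# Hypothetical readings: three meters, three appliance dimensions each.
readings = [[120, 30, 5], [80, 45, 12], [200, 10, 7]]

packed = [crt_pack(r, MODULI) for r in readings]
aggregate = sum(packed) % prod(MODULI)   # plaintext stand-in for homomorphic addition
totals = crt_unpack(aggregate, MODULI)
print(totals)  # [400, 85, 24]
```

Because each packed integer is congruent to its own reading modulo every modulus, the sum is congruent to the per-dimension sum, which is why a single aggregate carries all dimensions at once.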
Designing secure blockchain-based authentication and key management mechanism for Internet of Drones applications
IF 3.7, Zone 2, Computer Science
Journal of Systems Architecture, Volume 160, Article 103365. Pub Date: 2025-02-15. DOI: 10.1016/j.sysarc.2025.103365
Mohammad Wazid, Saksham Mittal, Ashok Kumar Das, SK Hafizul Islam, Mohammed J.F. Alenazi, Athanasios V. Vasilakos
Abstract: Due to advances in Information and Communications Technology (ICT) and the Internet of Things (IoT), the Internet of Drones (IoD) can be employed in numerous applications, facilitating the daily lives of diverse users, including civilians and others. The wireless nature of its communication leaves an IoD environment vulnerable to various potential attacks, such as data breaches, man-in-the-middle, impersonation, replay, and data leaking attacks. As a result, the security of the IoD environment becomes crucial. To safeguard the data and devices (such as IoT-enabled drones and servers) integral to IoD networks, a security solution is essential. It is imperative to implement targeted security measures, such as intrusion detection, access control, and authentication, in order to establish a security scheme that is both reliable and efficient. In this article, we mainly focus on developing a secure authentication and key management scheme that leverages blockchain technology. Most existing authentication techniques proposed for IoT and IoD environments are either inefficient in communication and computation, or insecure against various attacks. To mitigate these issues, this study proposes a secure blockchain-based authentication and key management scheme for IoD applications (in short, BAKMM-IoD). The blockchain is applied here for secure data storage. After a detailed security analysis and formal security verification with the widely recognized Scyther tool, the proposed BAKMM-IoD has exhibited resilience against different potential attacks. BAKMM-IoD also surpasses other contemporary schemes in terms of security and functionality features, computational costs, and communication costs. Moreover, the blockchain simulation shows the influence of the proposed BAKMM-IoD on critical performance metrics in real-world scenarios.
Citations: 0
An effective and verifiable secure aggregation scheme with privacy-preserving for federated learning
IF 3.7, Zone 2, Computer Science
Journal of Systems Architecture, Volume 161, Article 103364. Pub Date: 2025-02-15. DOI: 10.1016/j.sysarc.2025.103364
Rong Wang, Ling Xiong, Jiazhou Geng, Chun Xie, Ruidong Li
Abstract: Federated learning has gained significant attention for enabling collaborative model training on distributed devices while maintaining data privacy. However, sharing gradients poses risks to local data privacy. This paper presents a secure aggregation scheme that addresses privacy protection and verifiability in federated learning. First, a new homomorphic signature algorithm is used to verify the aggregation results. For efficient verification, this algorithm is divided into an offline phase and an online phase, where results are pre-computed during the offline phase and reused. Second, we use a lightweight symmetric homomorphic encryption algorithm to generate public keys, which greatly accelerates key generation and makes both encryption and decryption particularly efficient. Under this architecture, the aggregation server is unable to peek into the specific content of each gradient. The task management center cannot access the client’s individual gradient and can only process the aggregated information. This design ensures that the aggregation server and task management center can only access information within their permissions, effectively preventing information leakage. Finally, the security assessment indicates that our method satisfies the essential security standards for privacy-preserving federated learning. Comprehensive experimental evaluations conducted on real-world datasets reveal that the proposed solution demonstrates impressive efficiency in practical applications.
Citations: 0
SmartDeCoup: Decoupling the STT-RAM LLC for even write distribution and lifetime improvement
IF 3.7, Zone 2, Computer Science
Journal of Systems Architecture, Volume 161, Article 103367. Pub Date: 2025-02-15. DOI: 10.1016/j.sysarc.2025.103367
Prabuddha Sinha, Krishna Prathik B.V., Shirshendu Das, Venkata Kalyan Tavva
Abstract: Static Random Access Memory (SRAM) based Last Level Caches (LLCs) are losing their edge to Non-Volatile Memories (NVMs) like Spin-Transfer Torque RAM (STT-RAM), which offer advantages including higher density and lower static power consumption. However, they have drawbacks, namely, higher write latency, higher write power consumption, and lower write endurance. Uneven distribution of writes leads to reduced write endurance. Existing endurance enhancement techniques focus on reducing write variation to extend the lifetime. Additionally, these techniques cannot be implemented on top of recent secure cache designs that protect LLCs from timing channel attacks. They cannot prevent recently proposed endurance attacks on the STT-RAM LLC. SmartDeCoup proposes a decoupled tag/data array structure for STT-RAM LLCs and, on top of this structure, introduces two approaches to enhance LLC lifetime: (a) the Primal Approach, and (b) the Hardware Efficient Approach. The Primal Approach achieves a maximum relative lifetime improvement of 24.99× and 33.13× in single core and multicore systems, with an 8.79% area overhead. The Hardware Efficient Approach achieves improvements of 22.47× and 31.83×, with a 7.23% area overhead. The Primal Approach is capable of preventing endurance attacks and is also compatible with the recently proposed countermeasures for timing channel attacks on LLC.
Citations: 0
A dual-index Boolean retrieval scheme with dynamic and revocable attribute-based policies
IF 3.7, Zone 2, Computer Science
Journal of Systems Architecture, Volume 161, Article 103366. Pub Date: 2025-02-15. DOI: 10.1016/j.sysarc.2025.103366
Jingting Xue, Qinfang Deng, Wenzheng Zhang, Kangyi Liu, Xiaojun Zhang, Yu Zhou
Abstract: Boolean retrieval is widely employed in information retrieval due to its versatile operator combinations. In the context of secure cloud data sharing, data owners can tailor retrieval authorizations, offering remote nodes a convenient way to access data. Nevertheless, traditional retrieval models depend on online interactions for authorization, and fixed policies restrict control over retrieval. Constrained by index structures, existing Boolean retrieval methods encounter performance bottlenecks in terms of retrieval speed and storage efficiency. In this paper, we propose a dual-index Boolean retrieval scheme, dibRS, that incorporates dynamic and revocable attribute-based policies. Specifically, leveraging attribute-based zero-knowledge proofs (AB-ZKP), we construct the authorization verification structure using Lagrange interpolation polynomials. By constructing a dual-index structure that integrates both inverted and forward indexes, dibRS facilitates efficient Boolean retrieval. A puncturable pseudorandom function constructs the forward index, enabling selective revocation of search trapdoors through puncturing, without requiring full index regeneration. Additionally, by utilizing chameleon hash collisions, dibRS allows customizable index modifications and dynamic policy updates on redactable blockchains. Throughout this process, dibRS enables non-interactive authorization, significantly alleviating the communication burden on data owners. Finally, we demonstrate the adaptive security and computational feasibility of dibRS.
Citations: 0
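The abstract above describes combining an inverted index with a forward index for Boolean retrieval. As a rough plaintext illustration of that dual-index idea only, the sketch below builds both indexes and answers an AND/NOT query. It contains none of the scheme's cryptography (no AB-ZKP, puncturable PRF, or chameleon hash), and the function names, documents, and keywords are all hypothetical.

```python
from collections import defaultdict

def build_indexes(docs):
    """Return (inverted, forward): keyword -> doc ids, and doc id -> keywords."""
    inverted = defaultdict(set)
    forward = {}
    for doc_id, keywords in docs.items():
        forward[doc_id] = set(keywords)
        for kw in keywords:
            inverted[kw].add(doc_id)
    return inverted, forward

def search(inverted, forward, must, must_not=()):
    """Documents containing every keyword in `must` and none in `must_not`."""
    if not must:
        return set()
    # Inverted index: start from the rarest keyword to keep candidates few.
    rarest = min(must, key=lambda kw: len(inverted.get(kw, ())))
    candidates = inverted.get(rarest, set())
    # Forward index: check the remaining conditions per candidate document.
    return {
        d for d in candidates
        if all(kw in forward[d] for kw in must)
        and not any(kw in forward[d] for kw in must_not)
    }

# Hypothetical document collection.
docs = {
    "d1": ["cloud", "privacy", "index"],
    "d2": ["cloud", "index"],
    "d3": ["privacy", "blockchain"],
}
inverted, forward = build_indexes(docs)
hits = search(inverted, forward, must=["cloud", "index"], must_not=["privacy"])
print(sorted(hits))  # ['d2']
```

The inverted index narrows the search to a small candidate set, and the forward index lets each candidate be checked against the rest of the query without scanning further posting lists.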
Quantum-safe identity-based designated verifier signature for BIoMT
IF 3.7, Zone 2, Computer Science
Journal of Systems Architecture, Volume 160, Article 103362. Pub Date: 2025-02-15. DOI: 10.1016/j.sysarc.2025.103362
Chaoyang Li, Yuling Chen, Mianxiong Dong, Jian Li, Min Huang, Xiangjun Xin, Kaoru Ota
Abstract: Blockchain technology changes the centralized management form in traditional healthcare systems and constructs a distributed and secure medical data-sharing mechanism to maximize data value. However, the advanced capabilities of quantum algorithms pose a serious threat to current blockchain cryptographic algorithms, which are based on classical mathematical hardness problems. This paper proposes the first quantum-safe identity-based designated verifier signature (ID-DVS) scheme for blockchain-based Internet of Medical Things (BIoMT) systems. This scheme is constructed based on the lattice assumption of the short integer solution (SIS) problem, which is believed to resist quantum attacks. The identity mechanism helps to establish transaction traceability when data is shared among different medical institutions. The designated verifier mechanism also prevents unauthorized users from accessing data, improving the security of medical data-sharing processes. Next, this ID-DVS scheme is proved secure in the random oracle model, achieving the security properties of anonymity and unforgeability. It also captures post-quantum security. Then, a performance analysis of key size and time consumption is presented, and the results show that this ID-DVS is more efficient than other similar schemes. Therefore, this work supports secure medical data-sharing and protects the privacy of users and medical data.
Citations: 0
Lightweight batch authentication and key agreement scheme for IIoT gateways
IF 3.7, Zone 2, Computer Science
Journal of Systems Architecture, Volume 160, Article 103368. Pub Date: 2025-02-15. DOI: 10.1016/j.sysarc.2025.103368
Xiaohui Ding, Jian Wang, Yongxuan Zhao, Zhiqiang Zhang
Abstract: Existing authentication and key agreement (AKA) schemes face two primary challenges in IIoT, where users dynamically communicate with multiple industrial devices. The first is significant computational and communication overhead, along with security vulnerabilities. The second is the inability to keep the gateway lightweight. To address these issues, this paper proposes a gateway-lightweight batch AKA scheme based on elliptic curve cryptography for IIoT. When users access multiple industrial devices, they only need to send a batch authentication request to the gateway. Based on this request, the gateway generates a time-limited token using the Chinese Remainder Theorem (CRT), enabling users to efficiently complete AKA with multiple devices in a batch manner. Furthermore, the application of the CRT allows the gateway to efficiently update the time-limited token when the user’s accessed devices change. Finally, due to the use of the time-limited token, the entire scheme requires only one round of interaction between the gateway and the user, keeping the gateway lightweight. The security of the proposed scheme is proved through formal security proofs, heuristic analysis, and the Scyther tool. Performance analysis shows that, compared to the schemes evaluated, the proposed scheme meets all listed security requirements with lower computational and communication overheads.
Citations: 0
EDF-based Energy-Efficient Probabilistic Imprecise Mixed-Criticality Scheduling
IF 3.7, Zone 2, Computer Science
Journal of Systems Architecture, Volume 160, Article 103361. Pub Date: 2025-02-12. DOI: 10.1016/j.sysarc.2025.103361
Yi-Wen Zhang, Jin-Long Zhang
Abstract: We focus on Mixed-Criticality Systems (MCS), which involve the integration of multiple subsystems with varying levels of criticality on shared hardware platforms. The classic MCS task model assumes hard real-time constraints and no Quality-of-Service (QoS) for low-criticality tasks in high-criticality mode. Many researchers have put forward a range of extensions to the classic MCS task model to make MCS theory more applicable in industry practice. In this paper, we consider an Imprecise MCS taskset scheduled with the Earliest Deadline First algorithm on a uniprocessor platform, and propose an Energy-Efficient Task Execution Model that guarantees (deterministic or probabilistic) schedulability, allows degraded QoS to low-criticality tasks in high-criticality mode, and applies Dynamic Voltage and Frequency Scaling to save energy.
Citations: 0
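The abstract above pairs Earliest Deadline First scheduling with Dynamic Voltage and Frequency Scaling on a uniprocessor. The sketch below shows only the textbook starting point for that combination, not the paper's mixed-criticality or probabilistic analysis: for implicit-deadline periodic tasks under preemptive EDF, the task set is schedulable exactly when total utilization is at most 1. Assuming execution time scales inversely with processor speed, the lowest constant normalized speed therefore equals the utilization. The task parameters are hypothetical.

```python
from fractions import Fraction

def utilization(tasks):
    """Total utilization of (wcet, period) pairs, with WCET measured at full speed."""
    return sum(Fraction(c, t) for c, t in tasks)

def edf_schedulable(tasks, speed=Fraction(1)):
    """EDF test for implicit-deadline periodic tasks on one processor.

    Assumes execution time scales as 1/speed, so the test is U / speed <= 1.
    """
    return utilization(tasks) / speed <= 1

def min_constant_speed(tasks):
    """Lowest normalized constant speed that keeps the task set EDF-schedulable."""
    return utilization(tasks)

# Hypothetical task set: (worst-case execution time, period) pairs.
tasks = [(1, 4), (2, 8), (1, 10)]

speed = min_constant_speed(tasks)
print(speed)                                    # 3/5
print(edf_schedulable(tasks, speed))            # True
print(edf_schedulable(tasks, Fraction(1, 2)))   # False
```

Exact fractions avoid floating-point error at the boundary case where utilization divided by speed is exactly 1.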
Collaborative optimization of offloading and pricing strategies in dynamic MEC system via Stackelberg game
IF 3.7, Zone 2, Computer Science
Journal of Systems Architecture, Volume 160, Article 103360. Pub Date: 2025-02-12. DOI: 10.1016/j.sysarc.2025.103360
Jing Mei, Cuibin Zeng, Zhao Tong, Longbao Dai, Keqin Li
Abstract: The rapid advancement of 5G technology has indirectly propelled the growth of connected devices within the Internet of Things (IoT). Within the IoT domain, mobile edge computing (MEC) has demonstrated potential in task processing. However, as computational services expand, the reliable determination of user offloading strategies and the rational establishment of service prices offered by servers to users continue to present challenging research directions. This paper focuses on task offloading in an MEC system comprising numerous user terminal devices that support energy harvesting (EH), an MEC server, and a central cloud server. The optimization goals are to maximize the utilities of both users and the MEC server by adjusting offloading and pricing strategies. To guarantee the task queue’s stability within the system and achieve a reasonable allocation of system resources, we propose a dynamic task offloading approach rooted in Lyapunov optimization theory and Stackelberg game theory. In this algorithm, the MEC server takes on the role of the leader, while each user terminal device acts as a follower. To establish the existence of the game equilibrium, a series of mathematical analyses is carried out. Additionally, we conduct extensive simulation experiments to validate the proposed algorithm’s effectiveness. The proposed algorithm achieves improvements in user utility, with a 6.43% increase compared to the average time-constrained task offloading (ATCTO) scheme, a 61.80% improvement over the local-only processing (LOP) scheme, and a 23.97% enhancement over the genetic algorithm (GA) scheme. Meanwhile, it achieves a task queue backlog reduction of 50.00% compared to ATCTO, 70.00% compared to LOP, and 15.28% compared to GA.
Citations: 0
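The abstract above casts pricing and offloading as a Stackelberg game with the server as leader and each device as follower. The toy sketch below illustrates only the backward-induction pattern behind such games, with one follower and utility functions invented for illustration: the follower offloads an amount x to maximize a·ln(1+x) − p·x, and the leader picks the price p to maximize (p − c)·x. It does not model the paper's Lyapunov optimization, energy harvesting, or multi-user setting.

```python
import math

def follower_best_response(price, a):
    """Offloaded amount x >= 0 maximizing a*ln(1+x) - price*x.

    Setting the derivative a/(1+x) - price to zero gives x = a/price - 1.
    """
    return max(a / price - 1.0, 0.0)

def leader_profit(price, a, cost):
    """Leader's profit when the follower plays its best response."""
    return (price - cost) * follower_best_response(price, a)

def leader_optimal_price(a, cost):
    """Closed form from backward induction, valid for 0 < cost < a.

    Profit is a - p - a*cost/p + cost on cost < p < a, maximized at sqrt(a*cost).
    """
    return math.sqrt(a * cost)

a, cost = 9.0, 1.0  # hypothetical follower valuation and leader unit cost
p_star = leader_optimal_price(a, cost)
x_star = follower_best_response(p_star, a)
print(p_star, x_star, leader_profit(p_star, a, cost))  # 3.0 2.0 4.0
```

The leader solves its problem knowing how the follower will react, so the follower's best response is substituted into the leader's profit before optimizing over the price.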
GTA: Generating high-performance tensorized program with dual-task scheduling
IF 3.7, Zone 2, Computer Science
Journal of Systems Architecture, Volume 160, Article 103359. Pub Date: 2025-02-07. DOI: 10.1016/j.sysarc.2025.103359
Anxing Xie, Yonghua Hu, Yaohua Wang, Zhe Li, Yuxiang Gao, Zenghua Cheng
Abstract: Generating high-performance tensorized programs for deep learning accelerators (DLAs) is crucial for ensuring the efficient execution of deep neural networks. But producing such programs for different operators across various DLAs is notoriously challenging. Existing methods utilize hardware abstraction to represent acceleration intrinsics, enabling end-to-end automated exploration of the intrinsics mapping space. However, their limited search space and inefficient exploration strategies often result in suboptimal tensorized programs and significant search time overhead.
In this paper, we propose GTA, a framework designed to generate high-performance tensorized programs for DLAs. Unlike existing deep learning compilers, we first coordinate intrinsic-based mapping abstraction with a rule-based program generation strategy, followed by the application of resource-constrained rules to eliminate ineffective tensor program candidates from the search space. Second, we employ a dual-task scheduling strategy to allocate tuning resources across multiple subgraphs of deep neural networks and their mapping candidates. As a result, GTA can find high-performance tensor programs that are outside the search space of existing state-of-the-art methods. Our experiments show that GTA achieves an average speedup of more than 1.88× over AMOS and 2.29× over Ansor on NVIDIA GPU with Tensor Core, as well as 1.49× over Ansor and 2.76× over PyTorch on CPU with AVX512.
Citations: 0