Journal of Systems Architecture: Latest Articles

A Lightweight and Generic Access Rights Update Mechanism for Attribute-Based Encryption in cloud storage
IF 4.1 | Zone 2 | Computer Science
Journal of Systems Architecture, Volume 168, Article 103550 | Pub Date: 2025-08-23 | DOI: 10.1016/j.sysarc.2025.103550
Zhiqiang Zhang, Youwen Zhu, Xiaohui Ding, Jian Wang, Junbeom Hur
Abstract: Cloud storage has emerged as a foundational tool for managing massive volumes of sensitive data across diverse domains. Attribute-Based Encryption (ABE) provides fine-grained access control to encrypted data in these scenarios. However, the practical application of ABE in cloud environments faces significant challenges, including the lack of a generic mechanism for updating access rights and inefficiencies in handling large-scale data. Existing methods suffer from high computational overhead and rely heavily on fully trusted clouds, limiting scalability and increasing privacy risks. To address these challenges, we propose a Lightweight and Generic Access Rights Update Mechanism (LGUM) for ABE. LGUM offers a universal framework for access rights update, leveraging precise ciphertext update components to minimize computation and communication costs. It supports efficient multi-user access rights update with constant complexity O(1), eliminating reliance on fully trusted clouds. We then demonstrate its generality across various ABE paradigms, including CP-ABE and KP-ABE. Comprehensive performance evaluations demonstrate significant reductions in communication and computational overhead compared to existing approaches, with formal security guarantees based on the BDH and MBDH assumptions.
Citations: 0
Safeguarding user data privacy in online Large Language Model services
IF 4.1 | Zone 2 | Computer Science
Journal of Systems Architecture, Volume 168, Article 103555 | Pub Date: 2025-08-23 | DOI: 10.1016/j.sysarc.2025.103555
Tianyu Bai, Yunhe Feng, Song Fu
Abstract: Large Language Models (LLMs), such as GPT, have become central to modern AI applications, including conversational agents, language translation, and document processing. Due to their computational demands, these models are typically hosted on remote servers, requiring users to transmit potentially sensitive data for inference. This raises serious privacy concerns, as user inputs may contain personally identifiable information (PII) and are vulnerable to misuse or unauthorized retention. To address this challenge, we present PPGPT, a novel and practical privacy-preserving GPT framework. PPGPT employs additive secret sharing to protect user input by enabling secure inference on secret shares rather than raw data. We design secure versions of key transformer components, including GELU and Softmax layers using Beaver’s triples and Taylor series, and introduce an optimized secure layer normalization protocol to reduce overhead. Experimental results show that PPGPT achieves comparable generation quality to the base model, with a negligible logits difference of 10^-5 and an average inference time of 1.98 s. The framework is lightweight, generalizable to transformer-based LLMs, and suitable for deployment in real-world online services requiring strong privacy guarantees.
Citations: 0
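
The PPGPT abstract above names additive secret sharing and Beaver's triples as its building blocks. The following is a minimal two-party sketch of those two textbook primitives only, not the PPGPT protocol itself. The modulus `P`, the trusted-dealer triple generation, and all function names are assumptions made for illustration.

```python
import secrets

P = 2**61 - 1  # prime modulus for share arithmetic (illustrative assumption)

def share(x):
    """Split x into two additive shares so that x = (s0 + s1) mod P."""
    s0 = secrets.randbelow(P)
    return s0, (x - s0) % P

def reconstruct(s0, s1):
    """Recombine two additive shares."""
    return (s0 + s1) % P

def beaver_triple():
    """Hypothetical trusted dealer: shares of random a, b and of c = a * b."""
    a, b = secrets.randbelow(P), secrets.randbelow(P)
    return share(a), share(b), share(a * b % P)

def secure_mul(x_sh, y_sh):
    """Multiply two shared values; only the masked values d and e are opened."""
    a_sh, b_sh, c_sh = beaver_triple()
    d = reconstruct((x_sh[0] - a_sh[0]) % P, (x_sh[1] - a_sh[1]) % P)  # d = x - a
    e = reconstruct((y_sh[0] - b_sh[0]) % P, (y_sh[1] - b_sh[1]) % P)  # e = y - b
    # x*y = c + d*b + e*a + d*e; the public term d*e is added by party 0 only.
    z0 = (c_sh[0] + d * b_sh[0] + e * a_sh[0] + d * e) % P
    z1 = (c_sh[1] + d * b_sh[1] + e * a_sh[1]) % P
    return z0, z1

x_sh, y_sh = share(6), share(7)
product = reconstruct(*secure_mul(x_sh, y_sh))
```

Addition needs no interaction (each party adds its own shares), which is why linear layers are cheap in such schemes and nonlinear layers such as GELU and Softmax need dedicated protocols.
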
A+Store: An Asynchronous Parallel Compaction for Multi-NDP-Enabled Key–Value Store
IF 4.1 | Zone 2 | Computer Science
Journal of Systems Architecture, Volume 168, Article 103549 | Pub Date: 2025-08-21 | DOI: 10.1016/j.sysarc.2025.103549
Hui Sun, Bo Chen, Jiaming Huang, Qiang Wang, Xiaole Liu, Yi Zhou, Yinliang Yue, Xiao Qin
Abstract: LSM-tree-based key–value stores face significant I/O bandwidth consumption and performance bottlenecks due to frequent data rewrites and migrations during compaction. To address this issue, near-data processing (NDP) technology has emerged as a promising solution and is gaining increasing attention. NDP reduces the data transfer distance between storage and processing resources by placing computational resources closer to storage devices or integrating them into memory, thereby effectively alleviating performance bottlenecks. However, existing multi-NDP key–value stores still face synchronization problems, leading to long wait times and underutilization of resources. To address these issues, we propose an asynchronous parallel compaction for multi-NDP-enabled key–value store named A+Store. In A+Store, to optimize data layout, we implement an MLSM-tree on each NDP device, an asynchronous execution queue for dynamic task management, and an independent metadata management method. This asynchronous mechanism allows each NDP device to update its metadata immediately after completing a compaction task rather than wait for other devices, thereby eliminating synchronization waiting time among NDP devices. Additionally, because each NDP device stores SSTables within specific key ranges, the device can perform sub-compaction tasks in parallel according to its key range, significantly enhancing the execution speed of tasks within each NDP device. This approach can improve the system’s parallel processing capability and resource utilization, addressing the bottlenecks in existing multi-NDP KV stores in applications that require large-scale data processing and low latency. To evaluate the performance of A+Store, we compare A+Store against state-of-the-art KV stores, including PStore, MStore, and RocksDB (configured with a RAID architecture). We develop a test toolkit using the real-world dataset OpenAlex, and study the performance of A+Store under realistic workloads. Experimental results show that A+Store demonstrates superior performance across all tests. For example, when loading 100 GB of writes, A+Store achieves 2.87× the throughput of PStore and 2× that of MStore, while reducing write amplification by 65.3% and 24.8% compared to PStore and MStore (both NDP-empowered KV stores), respectively.
Citations: 0
A new fixed-point simulation methodology for on-device AI based on large language models
IF 4.1 | Zone 2 | Computer Science
Journal of Systems Architecture, Volume 168, Article 103548 | Pub Date: 2025-08-21 | DOI: 10.1016/j.sysarc.2025.103548
Jung-Woo Kim, Seung-Hwan Yoon, Dong-Kyeong Kang, Seong-Won Lim, Hak-Bum Lee, Su-Min Oh, Young-Ho Seo
Abstract: Large language models (LLMs) have demonstrated outstanding performance across various natural language processing tasks, and their utilization in on-device environments is gradually increasing. This paper proposes a full integer arithmetic (fixed-point arithmetic) methodology utilizing fixed-point simulation to optimize the LLaMA3-8B-Instruct model for on-device hardware development. The proposed approach enables integer computations without performance degradation on the MMLU benchmark. Conventional quantization methods primarily focus on integer conversion of weight matrix multiplication operations; however, they require subsequent floating-point restoration, which can lead to computational bottlenecks. In contrast, this paper eliminates such floating-point dependencies by converting all operations, including SoftMax, layer normalization and activation functions, into fixed-point integer formats. Furthermore, to maintain accuracy in the integer computation, we partition the model’s computational graph into repeatable and one-to-one nodes (RONs) and hierarchically determine integer and fractional bit-widths, ensuring that the pre-trained parameters and the bit-width of constants and initial values used in inference are optimized. Experimental results show that the proposed approach maintains the same accuracy as the FP16/FP32 baseline while achieving up to an 84.67% reduction in hardware resource usage and approximately 16× inference speed-up, as analyzed using the Synopsys Design Compiler. This demonstrates that fully integer computation of LLMs can simultaneously achieve high performance and efficiency.
Citations: 0
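
The fixed-point abstract above turns on choosing integer and fractional bit-widths for every value. As a rough illustration of what such a format means, here is a generic fixed-point sketch in plain Python integers. It does not reproduce the paper's RON partitioning or its bit-width search; the helper names and the example widths (8 fractional bits, 16-bit registers) are assumptions.

```python
def to_fixed(x, frac_bits):
    """Quantize a real number to an integer carrying `frac_bits` fractional bits."""
    return int(round(x * (1 << frac_bits)))

def to_float(q, frac_bits):
    """Interpret a fixed-point integer as a real number again."""
    return q / (1 << frac_bits)

def fixed_mul(a, b, frac_bits):
    """Multiply two fixed-point values and rescale the result to `frac_bits`."""
    # The raw product carries 2 * frac_bits fractional bits, so shift half away.
    return (a * b) >> frac_bits

def saturate(q, total_bits):
    """Clamp to the signed range of a `total_bits`-wide hardware register."""
    lo, hi = -(1 << (total_bits - 1)), (1 << (total_bits - 1)) - 1
    return max(lo, min(hi, q))

FRAC = 8                                  # assumed fractional bit-width
a, b = to_fixed(1.5, FRAC), to_fixed(2.0, FRAC)
prod = saturate(fixed_mul(a, b, FRAC), 16)
```

Too few fractional bits loses precision and too few integer bits causes saturation, which is the trade-off a bit-width determination step has to resolve per node.
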
WEnSIBR: Compression using dynamic bases supported with encoding and zone wise wear leveling for NVMs
IF 4.1 | Zone 2 | Computer Science
Journal of Systems Architecture, Volume 168, Article 103543 | Pub Date: 2025-08-20 | DOI: 10.1016/j.sysarc.2025.103543
Swati Upadhyay, Arijit Nath, Hemangee K. Kapoor
Abstract: Non-Volatile Memories (NVM) are potential candidates to replace DRAM in main memory, but they have a few intrinsic flaws associated with their writes, such as poor endurance, excessive energy consumption, and long latency. Bit-flips due to costly write activities degrade the memory lifetime. We proposed SIBR, which compresses the incoming cache blocks before writing them to memory. Compressed blocks lead to fewer write activities in the memory cells. For further bit-flip reduction, we propose an extension of SIBR called EnSIBR that performs encoding on the SIBR-generated compressed blocks with relatively low storage overhead. However, only a small portion of memory, corresponding to the compressed block, is involved in write activity, leading to a skewed distribution of bit-flips across memory cells. Our second contribution is an intra-line wear-leveling method, which makes logical partitions of the memory lines called zones. We use the concept of the age of a zone, which indicates the number of writes the zone has incurred over time. We also consider the number of bit-flips experienced by the zones with each write. The proposal, WEnSIBR, evens out the skewed distribution created by SIBR and EnSIBR. The novelty of WEnSIBR lies in its ability to uniformly distribute and simultaneously reduce bit-flips, which boosts the NVM lifetime significantly.
Citations: 0
Prism: An efficient file mapping mechanism across multiple namespaces in mobile systems
IF 4.1 | Zone 2 | Computer Science
Journal of Systems Architecture, Volume 168, Article 103547 | Pub Date: 2025-08-20 | DOI: 10.1016/j.sysarc.2025.103547
Xianzhang Chen, Yuqi Liu, Xijie Zhu, Lin Chen, Qiao Sun, Shukan Liu
Abstract: Existing mobile operating systems, such as Android, employ a unidirectional file mapping model to share files across different components of the system. For example, a real file stored in F2FS of the Kernel layer may be mapped to the abstract file system in the Framework layer and the distributed file system in the Application layer. In this model, the namespaces of the file systems in different layers maintain varied metadata for the same real file, with modifications only propagating from top to bottom. Thus, the metadata of a real file in different namespaces may become inconsistent once the underlying file system updates the real file without notifying the mapping namespaces, leading to a “mapping invalidation” issue. In this paper, we construct a new bidirectional model for mapping files in mobile systems. Specifically, we design a mechanism called Prism to enable bidirectional file mapping among multiple namespaces transparently. The core idea is to introduce a Shared-Inode (SInode) structure in the system’s intermediate layer to manage shared essential metadata. Prism provides two synchronization modes, i.e., immediate synchronization mode and delayed synchronization mode, to propagate metadata changes from the lower namespaces to the upper namespaces in different scenarios. We implement Prism in Android 13 and evaluate it with real applications. Extensive experimental results show that Prism effectively resolves the “mapping invalidation” issue, particularly for image re-compression tasks. Experimental results demonstrate that our design reduces CPU and memory overhead by nearly 50% compared to the original system while having a trivial impact on overall performance.
Citations: 0
T-TNet: A dual-dependency trigger framework for active defense and hierarchical access control via multi-domain information fusion
IF 4.1 | Zone 2 | Computer Science
Journal of Systems Architecture, Volume 168, Article 103544 | Pub Date: 2025-08-18 | DOI: 10.1016/j.sysarc.2025.103544
Yizhun Zhang, Jie Huang, Peihao Li, Zeping Zhang, Changhao Ding
Abstract: As deep neural networks (DNNs) are increasingly deployed in multi-sensor fusion systems operating under low-security conditions, they are exposed to serious threats such as model theft, parameter tampering, and unauthorized usage. To address these challenges, this paper proposes an active defense framework based on personalized frequency-domain triggers, named T-TNet (TPS-Transform Triggered Defense Network). T-TNet fuses diverse feature information with a user-defined private point set through a dedicated fusion mechanism, enabling fine-grained behavior regulation and secure access control. On authorized data, T-TNet generates normal predictions; on unauthorized data, it outputs misleading predictions, thereby significantly enhancing model security. Experimental results demonstrate that, compared to baseline accuracy, T-TNet’s performance drops by no more than 2% while limiting the prediction accuracy on unauthorized data to below 5%. Moreover, T-TNet improves overall predictive performance by 24.69% compared to the latest research. This innovative framework offers a proactive defense strategy for protecting the intellectual property of deep learning models, particularly in low-security multi-sensor environments.
Citations: 0
TrustDedup: Secure data deduplication for IoT based on end–edge–cloud collaboration
IF 4.1 | Zone 2 | Computer Science
Journal of Systems Architecture, Volume 168, Article 103541 | Pub Date: 2025-08-14 | DOI: 10.1016/j.sysarc.2025.103541
Xin Yao, Chenxi Li, Jiawei Guo, Kecheng Huang, Ting Yao, Ming Zhao
Abstract: In Internet of Things (IoT)-enabled smart societies, the rapid growth of IoT devices has led to a substantial amount of redundant data stored in the cloud, significantly reducing storage efficiency. Although data deduplication effectively addresses redundancy, it introduces security concerns related to data confidentiality and ownership verification, particularly in semi-trusted cloud environments. Current deduplication methods primarily focus on cloud-only models and fail to accommodate the emerging end–edge–cloud collaborative framework driven by edge computing. To address these challenges, this paper proposes TrustDedup, a secure and efficient deduplication scheme that integrates edge–cloud collaboration and blockchain technology. The proposed scheme employs convergent encryption for secure data deduplication, uses blockchain-based smart contracts for transparent ownership verification, and includes a two-tiered deduplication approach to enhance efficiency and mitigate label inconsistency attacks. Security analyses and experimental results demonstrate that the proposed solution effectively improves deduplication efficiency and ensures robust data security in IoT scenarios.
Citations: 0
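
The TrustDedup abstract above relies on convergent encryption, in which the key is derived from the data itself so that identical plaintexts yield identical ciphertexts that a server can deduplicate. The sketch below shows only that general idea. The SHA-256 counter keystream is a toy stand-in for a real cipher, chosen so the example needs nothing beyond the standard library, and every function name is hypothetical. It says nothing about TrustDedup's smart contracts or two-tiered scheme.

```python
import hashlib

def convergent_key(data: bytes) -> bytes:
    """Derive the key from the content itself (the core of convergent encryption)."""
    return hashlib.sha256(data).digest()

def _keystream(key: bytes, n: int) -> bytes:
    """Toy keystream: SHA-256 over key || counter. Illustration only, not a vetted cipher."""
    out = bytearray()
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:n])

def encrypt(data: bytes):
    """Return (key, ciphertext, tag); the tag is what a store would compare."""
    key = convergent_key(data)
    ct = bytes(p ^ k for p, k in zip(data, _keystream(key, len(data))))
    tag = hashlib.sha256(ct).hexdigest()
    return key, ct, tag

def decrypt(key: bytes, ct: bytes) -> bytes:
    return bytes(c ^ k for c, k in zip(ct, _keystream(key, len(ct))))

store = {}  # tag -> ciphertext, standing in for the storage back end

def upload(data: bytes) -> bool:
    """Store the ciphertext once; return True if the upload was a duplicate."""
    _, ct, tag = encrypt(data)
    duplicate = tag in store
    store.setdefault(tag, ct)
    return duplicate

first = upload(b"sensor-17 temperature=21.5")
second = upload(b"sensor-17 temperature=21.5")
```

Because the ciphertext is deterministic, anyone who can guess a plaintext can confirm the guess, which is one reason schemes of this kind add ownership verification on top.
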
Efficient adaptive bandwidth allocation for deadline-aware online admission control in centralized time-sensitive networking
IF 4.1 | Zone 2 | Computer Science
Journal of Systems Architecture, Volume 168, Article 103539 | Pub Date: 2025-08-14 | DOI: 10.1016/j.sysarc.2025.103539
Sifan Yu, Feng He, Anlan Xie, Luxi Zhao
Abstract: With the growing demand for dynamic real-time applications, online admission control for time-critical event-triggered (ET) traffic in Time-Sensitive Networking (TSN) has become a critical challenge. The main issue lies in dynamically allocating bandwidth with real-time guarantees in response to traffic changes. This also demands rapid responsiveness, scalability, and efficient resource utilization for online applicability. To address this challenge, we propose an online admission control method for ET traffic based on a combined asynchronous traffic shaper (ATS, IEEE 802.1Qcr) and credit-based shaper (CBS, IEEE 802.1Qav) architecture. This method provides a flexible framework for real-time guaranteed online admission control, supporting dynamic bandwidth allocation and reclamation at runtime without requiring global reconfiguration, thus improving scalability. Within this framework, we further integrate a novel strategy based on network calculus (NC) theory for efficient and high-utilization bandwidth reallocation. On the one hand, the strategy focuses on adaptively balancing residual bandwidth with deadline awareness to prevent bottleneck egress ports, thereby improving admission capacity. On the other hand, it employs a non-trivial analytical result to reduce the search space, accelerating the solving process. Experimental results from both large-scale synthetic and realistic test cases show that, compared to the state-of-the-art, our method achieves an average 44% increase in admitted flows and an average 92% reduction in admission time. Additionally, it postpones the occurrence of bottleneck egress ports and the first rejection of admission requests, thereby enhancing adaptability.
Citations: 0
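
The TSN abstract above ties bandwidth allocation to deadlines through network calculus. As background only, the standard single-hop bound from network calculus is worked here; it is the textbook result for a token-bucket flow served by a rate-latency server, not the paper's own analysis, and the symbols $b$, $r$, $R$, $T$, $d$ are introduced for this illustration.

```latex
\begin{align*}
\alpha(t) &= b + r\,t, \qquad \beta(t) = R\,\max(0,\; t - T), \qquad r \le R,\\
D_{\max} &= T + \frac{b}{R},\\
D_{\max} \le d \;&\Longleftrightarrow\; R \ge \frac{b}{d - T} \quad (d > T).
\end{align*}
```

Here $\alpha$ is the arrival curve (burst $b$, sustained rate $r$), $\beta$ is the service curve (reserved rate $R$, latency $T$), and $d$ is the deadline. For example, with $b = 12{,}000$ bits, $T = 20\,\mu\text{s}$ and $d = 100\,\mu\text{s}$, the reserved rate must satisfy $R \ge 12{,}000 / (80 \times 10^{-6}) = 150$ Mbit/s. Relations of this shape let an admission controller trade reserved bandwidth against deadline slack.
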
SIRENA: SparsIty-REpetition aware Nibble-based hardware Accelerator for convolutional neural networks
IF 4.1 | Zone 2 | Computer Science
Journal of Systems Architecture, Volume 168, Article 103529 | Pub Date: 2025-08-11 | DOI: 10.1016/j.sysarc.2025.103529
Laura Medina, Jose Flich
Abstract: The growing demand for artificial intelligence (AI) applications calls for specialized hardware accelerators to handle intensive computational loads. To reduce computing needs, this paper introduces nibble decomposition (NBD), a method that splits 8-bit values into two 4-bit nibbles to detect and remove redundant computations in convolutional neural networks (CNNs). Experiments with INT8 quantized ResNet-50, MobileNet, and YOLO-V3 show that nibble decomposition can avoid up to 91% of multiplications in the upper nibble and 70% in the lower nibble. Building on this method, we present SIRENA, an NBD-based hardware accelerator that optimizes 8-bit quantized CNNs by skipping redundant operations without accuracy loss. Compared to a conventional value-agnostic accelerator, SIRENA achieves a 55% reduction in power consumption.
Citations: 0
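
The SIRENA abstract above describes splitting 8-bit values into two 4-bit nibbles so that zero (sparsity) and repeated (repetition) partial products can be skipped. The sketch below illustrates that arithmetic in software under simplifying assumptions: weights are treated as unsigned 8-bit values, and a dictionary stands in for whatever reuse logic the hardware employs. The function names and the skip counter are hypothetical.

```python
def split_nibbles(v):
    """Split an unsigned 8-bit value into its (upper, lower) 4-bit nibbles."""
    assert 0 <= v <= 0xFF
    return v >> 4, v & 0xF

def dot_with_reuse(activations, weights):
    """Dot product over nibble partial products, skipping zero and repeated ones."""
    cache = {}      # (activation, nibble) -> partial product already computed
    skipped = 0     # multiplications avoided through sparsity or repetition
    total = 0
    for a, w in zip(activations, weights):
        for nib, shift in zip(split_nibbles(w), (4, 0)):
            if a == 0 or nib == 0:
                skipped += 1            # sparsity: the partial product is zero
                continue
            key = (a, nib)
            if key in cache:
                skipped += 1            # repetition: reuse the earlier product
            else:
                cache[key] = a * nib    # the only place a multiply happens
            total += cache[key] << shift  # upper nibble is weighted by 16
    return total, skipped

acts = [3, 3, 0, 5]
wts = [0x12, 0x12, 0xFF, 0x0A]
total, skipped = dot_with_reuse(acts, wts)
```

Since `w = 16 * upper + lower`, summing the shifted partial products reproduces the ordinary dot product exactly, which is consistent with the abstract's claim of no accuracy loss.
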