IEEE Computer Architecture Letters: Latest Articles, Page 7

Empirical Architectural Analysis on Performance Scalability of Petascale All-Flash Storage Systems

IF 1.4 · CAS Zone 3, Computer Science

IEEE Computer Architecture Letters Pub Date : 2024-06-25 DOI: 10.1109/LCA.2024.3418874

Mohammadamin Ajdari;Behrang Montazerzohour;Kimia Abdi;Hossein Asadi

Citations: 0

Accelerating Programmable Bootstrapping Targeting Contemporary GPU Microarchitecture

IF 1.4 · CAS Zone 3, Computer Science

IEEE Computer Architecture Letters Pub Date : 2024-06-24 DOI: 10.1109/LCA.2024.3418448

Hyesung Ji;Sangpyo Kim;Jaewan Choi;Jung Ho Ahn

Abstract: Fully homomorphic encryption (FHE) enables computation on encrypted data without privacy leakage, among which GSW-based schemes are notable for supporting the evaluation of arbitrary univariate functions using programmable bootstrapping (PBS). Despite their wide applicability, their computational complexity in a single PBS impedes widespread adoption. However, at the application level, there are enough number of independent PBSs to achieve high data-level parallelism, making them suitable for running on GPUs known for their high parallel computing capability. On contemporary GPUs, peak integer performance has steadily increased, and the sizes of L2 cache and shared memory have also grown rapidly since the Volta architecture. Prior attempts to accelerate PBS on GPUs have fallen short due to their outdated implementations that cannot leverage recent GPU advances. In this paper, we introduce a GPU implementation that supports the latest PBS algorithm and incorporates GPU-trend-aware optimizations. Our implementation achieves a 10.8× performance improvement over the state-of-the-art (SOTA) GPU implementations on RTX 4090 and even outperforms the SOTA ASIC implementation.

IEEE Computer Architecture Letters, vol. 23, no. 2, pp. 207-210. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10570278

Citations: 0

TeleVM: A Lightweight Virtual Machine for RISC-V Architecture

IF 2.3 · CAS Zone 3, Computer Science

IEEE Computer Architecture Letters Pub Date : 2024-04-30 DOI: 10.1109/LCA.2024.3394835

Tianzheng Li;Enfang Cui;Yuting Wu;Qian Wei;Yue Gao

Abstract: Serverless computing has become an important paradigm in cloud computing due to its advantages such as fast large-scale deployment and pay-as-you-go charging model. Due to shared infrastructure and multi-tenant environments, serverless applications have high security requirements. Traditional virtual machines and containers cannot fully meet the requirements of serverless applications. Therefore, lightweight virtual machine technology has emerged, which can reduce overhead and boot time while ensuring security. In this letter, we propose TeleVM, a lightweight virtual machine for RISC-V architecture. TeleVM can achieve strong isolation through the hypervisor extension of RISC-V. Compared with traditional virtual machines, TeleVM only implements a small number of IO devices and functions, which can effectively reduce memory overhead and boot time. We compared TeleVM and QEMU+KVM through experiments. Compared to QEMU+KVM, the boot time and memory overhead of TeleVM have decreased by 74% and 90% respectively. This work further improves the cloud computing software ecosystem of RISC-V architecture and promotes the use of RISC-V architecture in cloud computing scenarios.

IEEE Computer Architecture Letters, vol. 23, no. 1, pp. 121-124.

Citations: 0

Analysis of Data Transfer Bottlenecks in Commercial PIM Systems: A Study With UPMEM-PIM

IF 1.4 · CAS Zone 3, Computer Science

IEEE Computer Architecture Letters Pub Date : 2024-04-12 DOI: 10.1109/LCA.2024.3387472

Dongjae Lee;Bongjoon Hyun;Taehun Kim;Minsoo Rhu

Citations: 0

GATe: Streamlining Memory Access and Communication to Accelerate Graph Attention Network With Near-Memory Processing

IF 2.3 · CAS Zone 3, Computer Science

IEEE Computer Architecture Letters Pub Date : 2024-04-10 DOI: 10.1109/LCA.2024.3386734

Shiyan Yi;Yudi Qiu;Lingfei Lu;Guohao Xu;Yong Gong;Xiaoyang Zeng;Yibo Fan

Abstract: Graph Attention Network (GAT) has gained widespread adoption thanks to its exceptional performance. The critical components of a GAT model involve aggregation and attention, which cause numerous main-memory access. Recently, much research has proposed near-memory processing (NMP) architectures to accelerate aggregation. However, graph attention requires additional operations distinct from aggregation, making previous NMP architectures less suitable for supporting GAT. In this paper, we propose GATe, a practical and efficient GAT accelerator with NMP architecture. To the best of our knowledge, this is the first time that accelerates both attention and aggregation computation on DIMM. In the attention and aggregation phases, we unify feature vector access to reduce repetitive memory accesses and refine the computation flow to reduce communication. Furthermore, we introduce a novel sharding method that enhances the data reusability. Experiments show that our work achieves substantial speedup of up to 6.77× and 2.46×, respectively, compared to state-of-the-art NMP works GNNear and GraNDe.

IEEE Computer Architecture Letters, vol. 23, no. 1, pp. 87-90.

Citations: 0

An Area Efficient Architecture of a Novel Chaotic System for High Randomness Security in e-Health

IF 2.3 · CAS Zone 3, Computer Science

IEEE Computer Architecture Letters Pub Date : 2024-04-10 DOI: 10.1109/LCA.2024.3387352

Kyriaki Tsantikidou;Nicolas Sklavos

Citations: 0

The Importance of Generalizability in Machine Learning for Systems

IF 2.3 · CAS Zone 3, Computer Science

IEEE Computer Architecture Letters Pub Date : 2024-04-02 DOI: 10.1109/LCA.2024.3384449

Varun Gohil;Sundar Dev;Gaurang Upasani;David Lo;Parthasarathy Ranganathan;Christina Delimitrou

Citations: 0

MajorK: Majority Based kmer Matching in Commodity DRAM

IF 2.3 · CAS Zone 3, Computer Science

IEEE Computer Architecture Letters Pub Date : 2024-04-02 DOI: 10.1109/LCA.2024.3384259

Z. Jahshan;L. Yavits

Citations: 0

SLO-Aware GPU DVFS for Energy-Efficient LLM Inference Serving

IF 1.4 · CAS Zone 3, Computer Science

IEEE Computer Architecture Letters Pub Date : 2024-03-28 DOI: 10.1109/LCA.2024.3406038

Andreas Kosmas Kakolyris;Dimosthenis Masouros;Sotirios Xydis;Dimitrios Soudris

Citations: 0

Dramaton: A Near-DRAM Accelerator for Large Number Theoretic Transforms

IF 2.3 · CAS Zone 3, Computer Science

IEEE Computer Architecture Letters Pub Date : 2024-03-27 DOI: 10.1109/LCA.2024.3381452

Yongmo Park;Subhankar Pal;Aporva Amarnath;Karthik Swaminathan;Wei D. Lu;Alper Buyuktosunoglu;Pradip Bose

Citations: 0