IEEE Computer Architecture Letters最新文献

筛选
英文 中文
X-ray: Discovering DRAM Internal Structure and Error Characteristics by Issuing Memory Commands X射线:通过发出内存命令发现DRAM内部结构和错误特征
IF 2.3 3区 计算机科学
IEEE Computer Architecture Letters Pub Date : 2023-07-17 DOI: 10.1109/LCA.2023.3296153
Hwayong Nam;Seungmin Baek;Minbok Wi;Michael Jaemin Kim;Jaehyun Park;Chihun Song;Nam Sung Kim;Jung Ho Ahn
{"title":"X-ray: Discovering DRAM Internal Structure and Error Characteristics by Issuing Memory Commands","authors":"Hwayong Nam;Seungmin Baek;Minbok Wi;Michael Jaemin Kim;Jaehyun Park;Chihun Song;Nam Sung Kim;Jung Ho Ahn","doi":"10.1109/LCA.2023.3296153","DOIUrl":"10.1109/LCA.2023.3296153","url":null,"abstract":"The demand for accurate information about the internal structure and characteristics of DRAM has been on the rise. Recent studies have explored the structure and characteristics of DRAM to improve processing in memory, enhance reliability, and mitigate a vulnerability known as rowhammer. However, DRAM manufacturers only disclose limited information through official documents, making it difficult to find specific information about actual DRAM devices. This paper presents reliable findings on the internal structure and characteristics of DRAM using activate-induced bitflips (AIBs), retention time test, and row-copy operation. While previous studies have attempted to understand the internal behaviors of DRAM devices, they have only shown results without identifying the causes or have analyzed DRAM modules rather than individual chips. We first uncover the size, structure, and operation of DRAM subarrays and verify our findings on the characteristics of DRAM. Then, we correct misunderstood information related to AIBs and demonstrate experimental results supporting the cause of rowhammer.","PeriodicalId":51248,"journal":{"name":"IEEE Computer Architecture Letters","volume":null,"pages":null},"PeriodicalIF":2.3,"publicationDate":"2023-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49364167","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
LADIO: Leakage-Aware Direct I/O for I/O-Intensive Workloads 针对I/O密集型工作负载的泄漏感知直接I/O
IF 2.3 3区 计算机科学
IEEE Computer Architecture Letters Pub Date : 2023-07-03 DOI: 10.1109/LCA.2023.3290427
Ipoom Jeong;Jiaqi Lou;Yongseok Son;Yongjoo Park;Yifan Yuan;Nam Sung Kim
{"title":"LADIO: Leakage-Aware Direct I/O for I/O-Intensive Workloads","authors":"Ipoom Jeong;Jiaqi Lou;Yongseok Son;Yongjoo Park;Yifan Yuan;Nam Sung Kim","doi":"10.1109/LCA.2023.3290427","DOIUrl":"10.1109/LCA.2023.3290427","url":null,"abstract":"The advancement in I/O technology has posed an unprecedented demand for high-performance processing on I/O data, leading to the development of Data Direct I/O (DDIO) technology. DDIO improves I/O processing efficiency by directly injecting all inbound I/O data into the last-level cache (LLC) in cooperation with any type of I/O device. Nonetheless, in certain scenarios with more than one I/O applications, DDIO may have sub-optimal performance caused by interference inside the LLC, resulting in the degradation of system performance. Especially, in this paper, we demonstrate that storage I/O on modern high-performance NVMe SSDs hardly benefits from DDIO, sometimes causing inefficient use of the shared LLC due to the “leaky DMA problem”. To address this problem, we propose \u0000<monospace>LADIO</monospace>\u0000, an adaptive approach that mitigates inter-application interference by dynamically controlling the DDIO functionality and reallocating LLC ways based on the leakage and locality of storage I/O data, respectively. In scenarios with heavy I/O interference, \u0000<monospace>LADIO</monospace>\u0000 improves the throughput of network-intensive applications by 20% while maintaining that of storage-intensive applications.","PeriodicalId":51248,"journal":{"name":"IEEE Computer Architecture Letters","volume":null,"pages":null},"PeriodicalIF":2.3,"publicationDate":"2023-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48507765","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
T-CAT: Dynamic Cache Allocation for Tiered Memory Systems With Memory Interleaving T-CAT:具有内存交错的分层内存系统的动态缓存分配
IF 2.3 3区 计算机科学
IEEE Computer Architecture Letters Pub Date : 2023-06-28 DOI: 10.1109/LCA.2023.3290197
Hwanjun Lee;Seunghak Lee;Yeji Jung;Daehoon Kim
{"title":"T-CAT: Dynamic Cache Allocation for Tiered Memory Systems With Memory Interleaving","authors":"Hwanjun Lee;Seunghak Lee;Yeji Jung;Daehoon Kim","doi":"10.1109/LCA.2023.3290197","DOIUrl":"10.1109/LCA.2023.3290197","url":null,"abstract":"New memory interconnect technology, such as Intel's Compute Express Link (CXL), helps to expand memory bandwidth and capacity by adding CPU-less NUMA nodes to the main memory system, addressing the growing memory wall challenge. Consequently, modern computing systems embrace the heterogeneity in memory systems, composing the memory systems with a tiered memory system with near and far memory (e.g., local DRAM and CXL-DRAM). However, adopting NUMA interleaving, which can improve performance by exploiting node-level parallelism and utilizing aggregate bandwidth, to the tiered memory systems can face challenges due to differences in the access latency between the two types of memory, leading to potential performance degradation for memory-intensive workloads. By tackling the challenges, we first investigate the effects of the NUMA interleaving on the performance of the tiered memory systems. We observe that while NUMA interleaving is essential for applications demanding high memory bandwidth, it can negatively impact the performance of applications demanding low memory bandwidth. Next, we propose a dynamic cache management, called \u0000<monospace>T-CAT</monospace>\u0000, which partitions the last-level cache between near and far memory, aiming to mitigate performance degradation by accessing far memory. \u0000<monospace>T-CAT</monospace>\u0000 attempts to reduce the difference in the average access latency between near and far memory by re-sizing the cache partitions. Through dynamic cache management, \u0000<monospace>T-CAT</monospace>\u0000 can preserve the performance benefits of NUMA interleaving while mitigating performance degradation by the far memory accesses. Our experimental results show that \u0000<monospace>T-CAT</monospace>\u0000 improves performance by up to 17% compared to cases with NUMA interleaving without the cache management.","PeriodicalId":51248,"journal":{"name":"IEEE Computer Architecture Letters","volume":null,"pages":null},"PeriodicalIF":2.3,"publicationDate":"2023-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48978905","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Hardware Accelerated Reusable Merkle Tree Generation for Bitcoin Blockchain Headers 用于比特币区块链头的硬件加速可重用Merkle树生成
IF 2.3 3区 计算机科学
IEEE Computer Architecture Letters Pub Date : 2023-06-28 DOI: 10.1109/LCA.2023.3289515
Kiseok Jeon;Junghee Lee;Bumsoo Kim;James J. Kim
{"title":"Hardware Accelerated Reusable Merkle Tree Generation for Bitcoin Blockchain Headers","authors":"Kiseok Jeon;Junghee Lee;Bumsoo Kim;James J. Kim","doi":"10.1109/LCA.2023.3289515","DOIUrl":"10.1109/LCA.2023.3289515","url":null,"abstract":"As the value of Bitcoin increases, the difficulty level of mining keeps increasing. This is generally addressed with application-specific integrated circuits (ASIC), but block candidates are still created by the software. The overhead of block candidate generation is relatively growing because the hash computation is boosted by ASIC. Additionally, it is getting harder to find the target nonce; If it is not found for a block candidate, a new block candidate must be generated. A new candidate can be generated to reduce the overhead of block candidate generation by modifying the coinbase without selecting and verifying transactions again. To this end, we propose a hardware accelerator for generating Merkle trees efficiently. The hash computation for Merkle tree generation is conducted with ASIC to reduce the overhead of block candidate generation, and the tree with only the modified coinbase is rapidly regenerated by reusing the intermediate results of the previously generated tree. Our simulation results demonstrate that the execution time can be reduced by up to 98.92% and power consumption by up to 99.73% when the number of transactions in a tree is 2048.","PeriodicalId":51248,"journal":{"name":"IEEE Computer Architecture Letters","volume":null,"pages":null},"PeriodicalIF":2.3,"publicationDate":"2023-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41591979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Containerized In-Storage Processing Model and Hardware Acceleration for Fully-Flexible Computational SSDs 用于全灵活计算固态硬盘的容器化存储内处理模型和硬件加速
IF 2.3 3区 计算机科学
IEEE Computer Architecture Letters Pub Date : 2023-06-27 DOI: 10.1109/lca.2023.3289828
Donghyun Gouk, Miryeong Kwon, Hanyeoreum Bae, Myoungsoo Jung
{"title":"Containerized In-Storage Processing Model and Hardware Acceleration for Fully-Flexible Computational SSDs","authors":"Donghyun Gouk, Miryeong Kwon, Hanyeoreum Bae, Myoungsoo Jung","doi":"10.1109/lca.2023.3289828","DOIUrl":"https://doi.org/10.1109/lca.2023.3289828","url":null,"abstract":"In-storage processing (ISP) efficiently examines large datasets but faces performance and security challenges. We introduce DockerSSD, a flexible ISP model that runs various applications near flash without modification. It employs lightweight OS-level virtualization in modern SSDs for faster ISP and better storage intelligence with a high flexiblity. DockerSSD reuses existing Docker container images for real-time data processing without altering the storage interface or runtime. Our design includes a new communication method and virtual firmware, alongside automated container-related network and I/O handling hardware. DockerSSD achieves a 2× speed improvement and reduces system-level power by 35.7%, on average.","PeriodicalId":51248,"journal":{"name":"IEEE Computer Architecture Letters","volume":null,"pages":null},"PeriodicalIF":2.3,"publicationDate":"2023-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142183968","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Guard Cache: Creating Noisy Side-Channels 保护缓存:创建有噪声的侧通道
IF 2.3 3区 计算机科学
IEEE Computer Architecture Letters Pub Date : 2023-06-27 DOI: 10.1109/LCA.2023.3289710
Fernando Mosquera;Krishna Kavi;Gayatri Mehta;Lizy John
{"title":"Guard Cache: Creating Noisy Side-Channels","authors":"Fernando Mosquera;Krishna Kavi;Gayatri Mehta;Lizy John","doi":"10.1109/LCA.2023.3289710","DOIUrl":"10.1109/LCA.2023.3289710","url":null,"abstract":"Microarchitectural innovations such as deep cache hierarchies, out-of-order execution, branch prediction and speculative execution have made possible the design of processors that meet ever-increasing demands for performance. However, these innovations have inadvertently introduced vulnerabilities, which are exploited by side-channel attacks and attacks relying on speculative executions. Mitigating the attacks while preserving the performance has been a challenge. In this letter we present an approach to obfuscate cache timing, making it more difficult for side-channel attacks to succeed. We create \u0000<italic>false cache hits</i>\u0000 using a small \u0000<italic>Guard Cache</i>\u0000 with randomization, and \u0000<italic>false cache misses</i>\u0000 by randomly evicting cache lines. We show that our \u0000<italic>false hits</i>\u0000 and \u0000<italic>false misses</i>\u0000 cause very minimal performance penalties and our obfuscation can make it difficult for common side-channel attacks such as Prime &Probe, Flush &Reload or Evict &Time to succeed.","PeriodicalId":51248,"journal":{"name":"IEEE Computer Architecture Letters","volume":null,"pages":null},"PeriodicalIF":2.3,"publicationDate":"2023-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41685733","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
TURBULENCE: Complexity-Effective Out-of-Order Execution on GPU With Distance-Based ISA TURBULENCE:利用基于距离的 ISA 在 GPU 上进行具有完备性的无序执行
IF 1.4 3区 计算机科学
IEEE Computer Architecture Letters Pub Date : 2023-06-26 DOI: 10.1109/LCA.2023.3289317
Reoma Matsuo;Toru Koizumi;Hidetsugu Irie;Shuichi Sakai;Ryota Shioya
{"title":"TURBULENCE: Complexity-Effective Out-of-Order Execution on GPU With Distance-Based ISA","authors":"Reoma Matsuo;Toru Koizumi;Hidetsugu Irie;Shuichi Sakai;Ryota Shioya","doi":"10.1109/LCA.2023.3289317","DOIUrl":"10.1109/LCA.2023.3289317","url":null,"abstract":"A graphics processing unit (GPU) is a processor that achieves high throughput by exploiting data parallelism. We found that many GPU workloads also contain instruction-level parallelism that can be extracted through out-of-order execution to provide additional performance improvement opportunities. We propose the TURBULENCE architecture for very low-cost out-of-order execution on GPUs. TURBULENCE consists of a novel ISA that introduces the concept of referencing operands by inter-instruction distance instead of register numbers, and a novel microarchitecture that executes the novel ISA. This distance-based operand has the property of not causing false dependencies. By exploiting this property, we achieve cost-effective out-of-order execution on GPUs without introducing expensive hardware such as a rename logic and a load-store queue. Simulation results show that TURBULENCE improves performance by 17.6% without increasing energy consumption over an existing GPU.","PeriodicalId":51248,"journal":{"name":"IEEE Computer Architecture Letters","volume":null,"pages":null},"PeriodicalIF":1.4,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142183933","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Toward Practical 128-Bit General Purpose Microarchitectures 面向实用的128位通用微体系结构
IF 2.3 3区 计算机科学
IEEE Computer Architecture Letters Pub Date : 2023-06-20 DOI: 10.1109/LCA.2023.3287762
Chandana S. Deshpande;Arthur Perais;Frédéric Pétrot
{"title":"Toward Practical 128-Bit General Purpose Microarchitectures","authors":"Chandana S. Deshpande;Arthur Perais;Frédéric Pétrot","doi":"10.1109/LCA.2023.3287762","DOIUrl":"10.1109/LCA.2023.3287762","url":null,"abstract":"Intel introduced 5-level paging mode to support 57-bit virtual address space in 2017. This, coupled to paradigms where backup storage can be accessed through load and store instructions (e.g., non volatile memories), lets us envision a future in which a 64-bit address space has become insufficient. In that event, the straightforward solution would be to adopt a flat 128-bit address space. In this early stage letter, we conduct high-level experiments that lead us to suggest a possible general-purpose processor micro-architecture providing 128-bit support with limited hardware cost.","PeriodicalId":51248,"journal":{"name":"IEEE Computer Architecture Letters","volume":null,"pages":null},"PeriodicalIF":2.3,"publicationDate":"2023-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48523174","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
DVFaaS: Leveraging DVFS for FaaS Workflows DVFaaS:为FaaS工作流利用DVFS
IF 2.3 3区 计算机科学
IEEE Computer Architecture Letters Pub Date : 2023-06-20 DOI: 10.1109/LCA.2023.3288089
Achilleas Tzenetopoulos;Dimosthenis Masouros;Dimitrios Soudris;Sotirios Xydis
{"title":"DVFaaS: Leveraging DVFS for FaaS Workflows","authors":"Achilleas Tzenetopoulos;Dimosthenis Masouros;Dimitrios Soudris;Sotirios Xydis","doi":"10.1109/LCA.2023.3288089","DOIUrl":"10.1109/LCA.2023.3288089","url":null,"abstract":"In this letter, we propose \u0000<italic>DVFaaS</i>\u0000, a per-core DVFS framework that utilizes control systems theory to assign \u0000<italic>just-enough</i>\u0000 frequency for the purpose of addressing the QoS requirements on serverless workflows comprising unseen functions. \u0000<italic>DVFaaS</i>\u0000 exploits the intermittent nature of serverless workflows, which enables staged control on distinguishable functions, which jointly contribute to the end-to-end latency. Our results show that \u0000<italic>DVFaaS</i>\u0000 considerably outperforms related work, reducing power consumption by up to 22%, with 2x fewer QoS violations.","PeriodicalId":51248,"journal":{"name":"IEEE Computer Architecture Letters","volume":null,"pages":null},"PeriodicalIF":2.3,"publicationDate":"2023-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46172046","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Design of a High-Performance, High-Endurance Key-Value SSD for Large-Key Workloads 面向大密钥工作负载的高性能、高持久键值SSD设计
IF 2.3 3区 计算机科学
IEEE Computer Architecture Letters Pub Date : 2023-06-02 DOI: 10.1109/LCA.2023.3282276
Chanyoung Park;Chun-Yi Liu;Kyungtae Kang;Mahmut Kandemir;Wonil Choi
{"title":"Design of a High-Performance, High-Endurance Key-Value SSD for Large-Key Workloads","authors":"Chanyoung Park;Chun-Yi Liu;Kyungtae Kang;Mahmut Kandemir;Wonil Choi","doi":"10.1109/LCA.2023.3282276","DOIUrl":"https://doi.org/10.1109/LCA.2023.3282276","url":null,"abstract":"Current KV-SSD design assumes a specific range of typical workloads, where the size of values is quite large while that of keys is relatively small. However, we find that (i) there exist another spectrum of workloads, whose key sizes are relatively large, compared to their value sizes, and (ii) the current KV-SSD design suffers from long tail latencies and low storage utilization under such large-key workloads. To this end, we present novel design of a KV-SSD (called LK-SSD), which can reduce tail latences and increase storage utilization under large-key workloads, and add an enhancement to it for longer device lifetime. Through extensive experiments, we show that LK-SSD is more suitable design for the large-key workloads, and also available for the typical workloads.","PeriodicalId":51248,"journal":{"name":"IEEE Computer Architecture Letters","volume":null,"pages":null},"PeriodicalIF":2.3,"publicationDate":"2023-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49962230","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信