DBB-ECC: Random Double Bit and Burst Error Correction Code for HBM3

IF 2.7 | Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE
Chaehyeon Shin;Jongsun Park
Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 44, no. 8, pp. 3236-3240
Publication date: 2025-02-21
Publication type: Journal Article
DOI: 10.1109/TCAD.2025.3544964
URL: https://ieeexplore.ieee.org/document/10899823/
Citations: 0

Abstract

As dynamic random access memory (DRAM) technology continues to scale down, DRAM vendors have adopted on-die error correction codes (on-die ECC) to address reliability problems caused by cell failures. For burst error correction, a single symbol correction (SSC) Reed-Solomon (RS) code is utilized in high bandwidth memory (HBM) 3. However, randomly scattered errors occur frequently with aggressive technology scaling, which necessitates a more robust error correction code (ECC) scheme that addresses both burst errors and scattered errors. This brief presents double bit and burst ECC (DBB-ECC), an efficient scheme designed to correct both single symbol errors and random double bit errors with reduced implementation overhead. In the proposed decoding, syndromes based on SSC RS codes are used to address both error types without increasing the number of parity bits. The decoder complexity has also been reduced by exploiting the syndrome patterns of double bit errors. The experimental results show that the proposed solution incurs lower implementation overhead than conventional schemes while maintaining the same level of correction capability. Compared to the conventional SSC code, it also significantly enhances HBM3 reliability without increasing storage overhead.
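
To make the role of the syndromes concrete, the following is a minimal sketch of textbook two-syndrome SSC RS decoding. It is not the authors' DBB-ECC decoder. The field GF(2^8), the primitive polynomial 0x11D, the 10-symbol codeword, and all function names are assumptions chosen for illustration and are not taken from the paper or from the HBM3 specification. The sketch corrects a burst confined to one symbol and shows that a double bit error spread over two symbols leaves a nonzero syndrome pair that the plain SSC rule cannot resolve, which is the case the abstract says DBB-ECC targets.

```python
# Illustrative only: GF(2^8), polynomial 0x11D and the codeword length are
# assumptions, not parameters from the paper or from HBM3.
PRIM_POLY = 0x11D

# Exponent/log tables for GF(2^8); EXP[i] is alpha**i.
EXP = [0] * 510
LOG = [0] * 256
_x = 1
for _i in range(255):
    EXP[_i] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= PRIM_POLY
for _i in range(255, 510):
    EXP[_i] = EXP[_i - 255]


def gf_mul(a, b):
    """Multiply two field elements."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def gf_div(a, b):
    """Divide a by a nonzero field element b."""
    if b == 0:
        raise ZeroDivisionError("division by zero in GF(2^8)")
    if a == 0:
        return 0
    return EXP[(LOG[a] - LOG[b]) % 255]


def syndromes(word):
    """Return (S0, S1) with S0 = sum r_i and S1 = sum r_i * alpha**i."""
    s0 = s1 = 0
    for i, sym in enumerate(word):
        s0 ^= sym
        s1 ^= gf_mul(sym, EXP[i])
    return s0, s1


def encode(data):
    """Place two parity symbols at positions 0 and 1 so both syndromes vanish."""
    a = b = 0
    for i, sym in enumerate(data, start=2):
        a ^= sym
        b ^= gf_mul(sym, EXP[i])
    # Solve p0 + p1 = a and p0 + alpha*p1 = b.
    p1 = gf_div(a ^ b, 1 ^ EXP[1])
    p0 = a ^ p1
    return [p0, p1] + list(data)


def correct_single_symbol(word):
    """Textbook SSC rule: error value is S0, error position is log(S1 / S0)."""
    s0, s1 = syndromes(word)
    if s0 == 0 and s1 == 0:
        return list(word), None          # no error detected
    if s0 == 0 or s1 == 0:
        return None, None                # not a single-symbol pattern
    pos = (LOG[s1] - LOG[s0]) % 255
    if pos >= len(word):
        return None, None                # location outside the codeword
    fixed = list(word)
    fixed[pos] ^= s0
    return fixed, pos


data = [0x12, 0x34, 0x56, 0x78, 0x9A, 0xBC, 0xDE, 0xF0]
cw = encode(data)

# Burst error confined to symbol 5: corrected by the SSC rule.
burst = list(cw)
burst[5] ^= 0xB7
burst_fixed, burst_pos = correct_single_symbol(burst)

# Double bit error spread over symbols 3 and 7: S0 has Hamming weight 2,
# and the SSC rule either gives up or changes the wrong symbol.
scattered = list(cw)
scattered[3] ^= 0x01
scattered[7] ^= 0x10
scattered_s0, scattered_s1 = syndromes(scattered)
scattered_fixed, _ = correct_single_symbol(scattered)
```

In this sketch the scattered case still yields a well-defined syndrome pair, so a decoder that also recognizes the syndrome patterns of double bit errors could in principle act on the same two parity symbols. How DBB-ECC constructs the code and decoder so that those patterns are unambiguous is described in the paper itself and is not reproduced here.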
Source journal
CiteScore: 5.60
Self-citation rate: 13.80%
Publication volume: 500
Review time: 7 months
Journal scope: The purpose of this Transactions is to publish papers of interest to individuals in the area of computer-aided design of integrated circuits and systems composed of analog, digital, mixed-signal, optical, or microwave components. The aids include methods, models, algorithms, and man-machine interfaces for system-level, physical and logical design including: planning, synthesis, partitioning, modeling, simulation, layout, verification, testing, hardware-software co-design and documentation of integrated circuit and system designs of all complexities. Design tools and techniques for evaluating and designing integrated circuits and systems for metrics such as performance, power, reliability, testability, and security are a focus.