Hamun: An approximate computing method to prolong the lifespan of ReRAM-based accelerators
Authors: Mohammad Sabri, Marc Riera, Antonio González
Journal of Systems Architecture, Volume 166, Article 103444. Published 2025-05-20.
DOI: 10.1016/j.sysarc.2025.103444
URL: https://www.sciencedirect.com/science/article/pii/S138376212500116X
ReRAM-based accelerators exhibit enormous potential to increase computational efficiency for DNN inference tasks, delivering significant performance and energy savings over traditional platforms. By incorporating adaptive scheduling, these accelerators dynamically adjust to DNN requirements, optimizing allocation of constrained hardware resources. However, ReRAM cells have limited endurance cycles due to wear-out from multiple updates for each inference execution, which shortens the lifespan of ReRAM-based accelerators and presents a practical challenge in positioning them as alternatives to conventional platforms like TPUs. Addressing these endurance limitations is essential for making ReRAM-based solutions viable for long-term, high-performance DNN inference.
To address these lifespan limitations, we introduce Hamun, an approximate computing method designed to extend the lifespan of ReRAM-based accelerators through a range of optimizations. Hamun incorporates a novel mechanism that detects cells that have become faulty due to wear-out and retires them, thereby avoiding their otherwise adverse impact on DNN accuracy. Moreover, Hamun extends the lifespan of ReRAM-based accelerators by adapting wear-leveling techniques across various abstraction levels of the accelerator and by implementing a batch execution scheme that maximizes ReRAM cell usage across multiple inferences. Additionally, Hamun introduces a new approximation method that leverages the fault tolerance characteristics of DNNs to delay the retirement of worn-out cells, reducing the performance penalty of retired cells and further extending the accelerator’s lifespan. On average, evaluated on a set of popular DNNs, Hamun demonstrates a lifespan improvement of 13.2× over a state-of-the-art baseline. The main contributors to this improvement are the fault handling and batch execution schemes, which provide 4.6× and 2.6× lifespan improvements, respectively.
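The intuition behind two of the ideas named above, batch execution and wear-leveling, can be illustrated with a toy model. The sketch below is not the paper's mechanism or evaluation setup. It assumes, purely for illustration, that each weight reload writes once to a fixed number of arrays, that one reload serves a whole batch of inferences, and that wear-leveling is a simple round-robin shift of the logical-to-physical mapping. The function name `lifetime_in_inferences` and all of its parameters are hypothetical.

```python
from typing import List


def lifetime_in_inferences(endurance: int, num_arrays: int, hot_arrays: int,
                           batch_size: int = 1, rotate: bool = False) -> int:
    """Count inferences served before a targeted array runs out of write cycles.

    Toy assumptions (not taken from the paper):
      * every weight reload writes once to `hot_arrays` arrays,
      * one reload serves `batch_size` inferences (batch execution),
      * `rotate` shifts the logical-to-physical mapping by one array
        per reload (a simple round-robin form of wear-leveling).
    """
    writes: List[int] = [0] * num_arrays  # write count per physical array
    served = 0
    offset = 0
    while True:
        targets = [(offset + i) % num_arrays for i in range(hot_arrays)]
        # Stop as soon as a reload would exceed an array's endurance budget.
        if any(writes[t] >= endurance for t in targets):
            return served
        for t in targets:
            writes[t] += 1
        served += batch_size
        if rotate:
            offset = (offset + 1) % num_arrays


if __name__ == "__main__":
    base = lifetime_in_inferences(100, 4, 1)
    leveled = lifetime_in_inferences(100, 4, 1, rotate=True)
    batched = lifetime_in_inferences(100, 4, 1, batch_size=8)
    both = lifetime_in_inferences(100, 4, 1, batch_size=8, rotate=True)
    print(base, leveled, batched, both)  # prints: 100 400 800 3200
```

In this toy setting, batching amortizes each reload over more inferences and rotation spreads writes over all arrays, so the two gains multiply. The numbers are artifacts of the chosen parameters and say nothing about the 13.2×, 4.6×, or 2.6× figures reported in the abstract.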
About the journal:
The Journal of Systems Architecture: Embedded Software Design (JSA) is a journal covering all design and architectural aspects related to embedded systems and software. It ranges from the microarchitecture level via the system software level up to the application-specific architecture level. Aspects such as real-time systems, operating systems, FPGA programming, programming languages, communications (limited to analysis and the software stack), mobile systems, parallel and distributed architectures as well as additional subjects in the computer and system architecture area will fall within the scope of this journal. Technology will not be a main focus, but its use and relevance to particular designs will be. Case studies are welcome but must contribute more than just a design for a particular piece of software.
Design automation of such systems, including methodologies, techniques and tools for their design, as well as novel designs of software components, falls within the scope of this journal. Novel applications that use embedded systems are also central to this journal. While hardware is not a part of this journal, hardware/software co-design methods that consider the interplay between software and hardware components with an emphasis on software are also relevant here.