Towards energy-efficient scientific computing: Reversible numerical linear algebra kernels in floating-point arithmetic

Impact factor 5.7 · Zone 3 (Computer Science) · JCR Q1 (Computer Science, Hardware & Architecture)
V. Dwarka
DOI: 10.1016/j.suscom.2025.101261
Journal: Sustainable Computing-Informatics & Systems, volume 49, Article 101261
Published: 2026-01-01 (Epub 2025-12-20)
Publication type: Journal Article
URL: https://www.sciencedirect.com/science/article/pii/S2210537925001829
Citations: 0

Abstract

Frontier scientific and AI workloads now reach $10^{19}$ to $10^{25}$ fused multiply–add (FMA) operations per run (on the order of $2\times10^{19}$ to $2\times10^{25}$ FLOPs). At today’s $\sim 10$ pJ per FMA, this corresponds to approximately $10^{8}$ to $10^{14}$ joules of arithmetic energy. At this scale, energy becomes the limiting resource for continued growth in computational workloads, motivating a re-evaluation of long-standing algorithmic assumptions. It is often assumed that reversible computing only matters near the Landauer limit. Building on prior physical arguments that full energy recovery is only possible when computation preserves information, we demonstrate that this same requirement governs floating-point numerical kernels: overwriting state enforces a non-zero energy floor, even under ideal recovery. Thus, eliminating this wall in practice requires that the numerical algorithm itself be injective. We therefore present the first reversible floating-point realizations of core dense numerical kernels (matrix multiplication, LU factorization, and conjugate-gradient iteration) that retain rounding information rather than discarding it. Implemented directly in IEEE arithmetic, they achieve machine-precision forward–reverse agreement on well- and ill-conditioned problems with minimal auxiliary state. A toggle-based model with measured switching costs and realistic recovery factors predicts $10^{3}$ to $10^{4}\times$ reductions in arithmetic energy. These results establish injective floating-point kernels as a foundation for energy-recovering numerical computation, and indicate that realizing this potential will require sustained co-design across applied mathematics, computer science, and hardware engineering.
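The idea of retaining rounding information can be illustrated with a small sketch. This is not the paper's construction, which the abstract does not spell out; it uses Knuth's well-known TwoSum error-free transformation as a stand-in, and the function names `two_sum` and `undo_sum` are hypothetical. A plain rounded update `a <- fl(a + b)` overwrites `a` and is not injective, whereas keeping the rounding error `e` alongside the sum lets the original `a` be recovered exactly, assuming IEEE double precision with round-to-nearest and no overflow.

```python
from fractions import Fraction
from typing import Tuple


def two_sum(a: float, b: float) -> Tuple[float, float]:
    """Knuth's TwoSum: return (s, e) with s = fl(a + b) and a + b == s + e exactly."""
    s = a + b
    b_virtual = s - a          # the part of b that actually made it into s
    a_virtual = s - b_virtual  # the part of a that actually made it into s
    e = (a - a_virtual) + (b - b_virtual)  # rounding error of the addition
    return s, e


def undo_sum(s: float, e: float, b: float) -> float:
    """Recover a from the sum s, the retained error e, and the addend b.

    Exact rational arithmetic is used here only to make the inverse
    unambiguous; it is an assumption of this sketch, not of the paper.
    """
    return float(Fraction(s) + Fraction(e) - Fraction(b))


# Plain rounded addition loses information: two different inputs, same output.
assert 1.0 + 1e-17 == 1.0 + 0.0

# Retaining the rounding error makes the update reversible.
a, b = 0.1, 0.2
s, e = two_sum(a, b)
assert Fraction(a) + Fraction(b) == Fraction(s) + Fraction(e)
assert undo_sum(s, e, b) == a
```

The cost of reversibility in this sketch is one extra floating-point word of auxiliary state per addition, which gives a sense of why keeping the auxiliary state small matters for whole kernels.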
Source journal
Sustainable Computing-Informatics & Systems
Categories: Computer Science, Hardware & Architecture; Computer Science, Information Systems
CiteScore: 10.70
Self-citation rate: 4.40%
Articles published: 142
Journal description: Sustainable computing is a rapidly expanding research area spanning computer science and engineering, electrical engineering, and other engineering disciplines. The aim of Sustainable Computing: Informatics and Systems (SUSCOM) is to publish research findings related to energy-aware and thermal-aware management of computing resources. Equally important is a spectrum of related research issues, such as applications of computing that can have ecological and societal impacts. SUSCOM publishes original and timely research papers and survey articles on power, energy, temperature, and environment-related topics of current importance to readers. SUSCOM has an editorial board comprising prominent researchers from around the world and selects competitively evaluated peer-reviewed papers.