Towards energy-efficient scientific computing: Reversible numerical linear algebra kernels in floating-point arithmetic
Authors: V. Dwarka
Journal: Sustainable Computing: Informatics and Systems, volume 49, Article 101261 (impact factor 5.7; JCR Q1, Computer Science, Hardware & Architecture)
Publication date: 2026-01-01 (Epub 2025-12-20)
DOI: 10.1016/j.suscom.2025.101261
URL: https://www.sciencedirect.com/science/article/pii/S2210537925001829
Frontier scientific and AI workloads now reach $10^{19}$–$10^{25}$ fused multiply–add (FMA) operations per run (on the order of $2 \times 10^{19}$–$2 \times 10^{25}$ FLOPs). At today’s $\sim 10$ pJ per FMA, this corresponds to approximately $10^{8}$–$10^{14}$ joules of arithmetic energy. At this scale, energy becomes the limiting resource for continued growth in computational workloads, motivating a re-evaluation of long-standing algorithmic assumptions. It is often assumed that reversible computing only matters near the Landauer limit. Building on prior physical arguments that full energy recovery is only possible when computation preserves information, we demonstrate that this same requirement governs floating-point numerical kernels: overwriting state enforces a non-zero energy floor, even under ideal recovery. Thus, eliminating this wall in practice requires that the numerical algorithm itself be injective. We therefore present the first reversible floating-point realizations of core dense numerical kernels (matrix multiplication, LU factorization, and conjugate-gradient iteration) that retain rounding information rather than discarding it. Implemented directly in IEEE arithmetic, they achieve machine-precision forward–reverse agreement on well- and ill-conditioned problems with minimal auxiliary state. A toggle-based model with measured switching costs and realistic recovery factors predicts $10^{3}$–$10^{4}\times$ reductions in arithmetic energy. These results establish injective floating-point kernels as a foundation for energy-recovering numerical computation, and indicate that realizing this potential will require sustained co-design across applied mathematics, computer science, and hardware engineering.
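To make the idea of retaining rounding information concrete, the sketch below uses the classic TwoSum error-free transformation (Knuth). It is an illustration only, not the paper's construction, which the abstract does not describe. The function name `two_sum` is an assumption introduced here. The sketch shows that ordinary IEEE addition maps distinct inputs to the same output, so it is not injective, while keeping the rounding error alongside the rounded sum distinguishes those inputs and preserves the exact value of a + b.

```python
from fractions import Fraction


def two_sum(a: float, b: float) -> tuple[float, float]:
    """Knuth's TwoSum (illustrative helper, not from the paper).

    Returns (s, e) where s is the rounded sum fl(a + b) and e is the
    rounding error, so that s + e equals a + b exactly, assuming IEEE
    round-to-nearest and no overflow.
    """
    s = a + b
    b_virtual = s - a          # the part of b that actually made it into s
    a_virtual = s - b_virtual  # the part of a that actually made it into s
    e = (a - a_virtual) + (b - b_virtual)
    return s, e


# Plain addition discards information: two different inputs give one output.
assert 1.0 + 1e-17 == 1.0 + 2e-17 == 1.0

# Keeping the rounding error tells the two inputs apart, and the pair (s, e)
# represents the exact sum (checked here with exact rational arithmetic).
for b in (1e-17, 2e-17):
    s, e = two_sum(1.0, b)
    assert Fraction(s) + Fraction(e) == Fraction(1.0) + Fraction(b)
    print(s, e)
```

The pair `(s, e)` costs one extra word of auxiliary state per addition in this naive form. The abstract reports that the paper's kernels need only minimal auxiliary state, which this sketch makes no attempt to reproduce.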
About the journal:
Sustainable computing is a rapidly expanding research area spanning computer science and engineering, electrical engineering, and other engineering disciplines. The aim of Sustainable Computing: Informatics and Systems (SUSCOM) is to publish research findings related to energy-aware and thermal-aware management of computing resources. Equally important is a spectrum of related research issues, such as applications of computing that can have ecological and societal impacts. SUSCOM publishes original and timely research papers and survey articles on power, energy, temperature, and environment-related topics of current importance to readers. SUSCOM has an editorial board comprising prominent researchers from around the world and selects competitively evaluated, peer-reviewed papers.