Authors: Gabriele Magnani; Daniele Cattaneo; Lev Denisov; Giuseppe Tagliavini; Giovanni Agosta; Stefano Cherubin
Journal: IEEE Transactions on Computers, vol. 74, no. 9, pp. 3168-3180
Published: 2025-07-10
DOI: 10.1109/TC.2025.3586025
URL: https://ieeexplore.ieee.org/document/11077359/
Synergistic Memory Optimisations: Precision Tuning in Heterogeneous Memory Hierarchies
Balancing energy efficiency and high performance in embedded systems requires fine-tuning hardware and software components to co-optimize their interaction. In this work, we address the automated optimization of memory usage through a compiler toolchain that leverages DMA-aware precision tuning and mathematical function memoization. The proposed solution extends the LLVM infrastructure, employing the TAFFO plugins for precision tuning, with the SeTHet extension for DMA-aware precision tuning and luTHet for automated, DMA-aware mathematical function memoization. We performed an experimental assessment on HERO, a heterogeneous platform employing RISC-V cores as a parallel accelerator. Our solution enables speedups over the baseline HERO platform ranging from $1.5\times$ to $51.1\times$ on AxBench benchmarks that employ trigonometric functions, and from $4.23\times$ to $48.4\times$ on Polybench benchmarks.
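As a rough illustration of the two ideas the abstract combines, the sketch below stores a precomputed sine table in a reduced-precision fixed-point format and answers calls by table lookup. It is not the TAFFO, SeTHet, or luTHet implementation: the 12-bit fractional format, the 1024-entry table, the nearest-entry lookup, and all function names are assumptions chosen only for this example.

```python
import math

FRAC_BITS = 12     # assumed fixed-point format: 12 fractional bits
TABLE_SIZE = 1024  # assumed number of entries covering one period [0, 2*pi)


def to_fixed(x: float) -> int:
    """Quantize a real value to a signed fixed-point integer."""
    return round(x * (1 << FRAC_BITS))


def to_float(q: int) -> float:
    """Convert a fixed-point integer back to a float."""
    return q / (1 << FRAC_BITS)


# Memoization step: evaluate sin once per table entry, store it in fixed point.
SIN_TABLE = [
    to_fixed(math.sin(2 * math.pi * i / TABLE_SIZE)) for i in range(TABLE_SIZE)
]


def sin_lut(x: float) -> float:
    """Approximate sin(x) with the nearest table entry (no interpolation)."""
    index = round(x / (2 * math.pi) * TABLE_SIZE) % TABLE_SIZE
    return to_float(SIN_TABLE[index])


def max_abs_error(samples: int = 10_000) -> float:
    """Largest deviation from math.sin over evenly spaced points in one period."""
    points = (2 * math.pi * k / samples for k in range(samples))
    return max(abs(sin_lut(x) - math.sin(x)) for x in points)
```

With these assumed parameters every entry fits in a 16-bit integer, so the table takes half the space of the same table held as 32-bit floats. That is the kind of interaction between data width and memory footprint the abstract refers to; the accuracy cost of a given configuration can be estimated with `max_abs_error()`.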
About the journal:
The IEEE Transactions on Computers is a monthly publication with a wide distribution to researchers, developers, technical managers, and educators in the computer field. It publishes papers on research in areas of current interest to the readers. These areas include, but are not limited to, the following: a) computer organizations and architectures; b) operating systems, software systems, and communication protocols; c) real-time systems and embedded systems; d) digital devices, computer components, and interconnection networks; e) specification, design, prototyping, and testing methods and tools; f) performance, fault tolerance, reliability, security, and testability; g) case studies and experimental and theoretical evaluations; and h) new and important applications and trends.