{"title":"CoaT: Compiler-Assisted Two-Stage Offloading Approach for Data-Intensive Applications Under NMP Framework","authors":"Satanu Maity;Mayank Goel;Manojit Ghose","doi":"10.1109/TETC.2024.3495218","DOIUrl":null,"url":null,"abstract":"As we head toward a data-centric era, conventional computing systems become inadequate to meet the evolving demands of the applications. As a result, the near-memory processing (NMP) computing paradigm emerges as a potential alternative framework where regions of an application are offloaded for execution near the memory. Although some interesting research works have been proposed in recent times, none of them have considered placing processing cores jointly on the primary memories and cache memory. Further, they did not consider the data locality offered by the last level cache (LLC) and the estimated execution time of an application region together while designing the offloading strategy. This paper presents a novel hybrid NMP computation framework comprising a traditional multicore processor, NMP-enabled 3D memories and NMP-enabled LLC. The application source code is processed through a compilation framework to identify potential offloadable regions. The paper further proposes a two-stage offloading strategy, <italic>CoaT</i>, which determines the execution location of the application regions based on the region’s overall execution time and the data locality offered by the LLC. A comprehensive series of experiments conducted using well-established simulators for large data-intensive applications, provides strong evidence of the efficacy of our approach. The results demonstrate significant reductions in execution time (averaging 60% with a maximum reduction of 64%), un-core energy consumption (averaging 34% with a maximum reduction of 44%), and off-chip data block transfer count (averaging 61% with a maximum reduction of 80%) compared to the state-of-the-art policies. 
The proposed policy achieves a speedup of 2.6x (on average) and 3.1x (maximum) w.r.t. the conventional execution.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"13 3","pages":"753-767"},"PeriodicalIF":5.4000,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Emerging Topics in Computing","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10755004/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Citations: 0
Abstract
As we head toward a data-centric era, conventional computing systems are becoming inadequate for the evolving demands of applications. As a result, the near-memory processing (NMP) computing paradigm has emerged as a potential alternative framework in which regions of an application are offloaded for execution near the memory. Although some interesting research works have been proposed in recent times, none of them has considered placing processing cores jointly on the primary memories and the cache memory. Further, they did not consider the data locality offered by the last-level cache (LLC) together with the estimated execution time of an application region while designing the offloading strategy. This paper presents a novel hybrid NMP computation framework comprising a traditional multicore processor, NMP-enabled 3D memories, and an NMP-enabled LLC. The application source code is processed through a compilation framework to identify potential offloadable regions. The paper further proposes a two-stage offloading strategy, CoaT, which determines the execution location of each application region based on the region's overall execution time and the data locality offered by the LLC. A comprehensive series of experiments conducted using well-established simulators on large data-intensive applications provides strong evidence of the efficacy of our approach. The results demonstrate significant reductions in execution time (averaging 60%, with a maximum reduction of 64%), uncore energy consumption (averaging 34%, with a maximum reduction of 44%), and off-chip data block transfer count (averaging 61%, with a maximum reduction of 80%) compared to state-of-the-art policies. The proposed policy achieves a speedup of 2.6x on average (3.1x maximum) over conventional execution.
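The two-stage decision the abstract describes — first compare a region's estimated execution time on the host versus near memory, then use LLC data locality to choose between the NMP-enabled LLC and the NMP-enabled 3D memories — can be sketched as follows. This is a minimal illustrative sketch only: the region attributes, the threshold, and the decision structure are assumptions for exposition, not the paper's actual CoaT algorithm.

```python
# Hypothetical sketch of a two-stage offloading decision in the spirit of CoaT.
# All names, thresholds, and the decision structure are illustrative
# assumptions, not the paper's actual algorithm or cost model.
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    est_host_time: float  # estimated execution time on the host cores (ms)
    est_nmp_time: float   # estimated execution time near memory (ms)
    llc_hit_rate: float   # fraction of the region's accesses served by the LLC


def place_region(region: Region, llc_locality_threshold: float = 0.5) -> str:
    """Return where to execute the region: 'host', 'nmp-llc', or 'nmp-3d'."""
    # Stage 1: compare estimated execution times; keep the region on the
    # host cores if offloading is not predicted to pay off.
    if region.est_nmp_time >= region.est_host_time:
        return "host"
    # Stage 2: use LLC data locality to pick between the NMP-enabled LLC
    # and the NMP-enabled 3D memory stacks.
    if region.llc_hit_rate >= llc_locality_threshold:
        return "nmp-llc"
    return "nmp-3d"


if __name__ == "__main__":
    regions = [
        Region("stencil", est_host_time=12.0, est_nmp_time=5.0, llc_hit_rate=0.8),
        Region("scatter", est_host_time=9.0, est_nmp_time=4.0, llc_hit_rate=0.1),
        Region("control", est_host_time=2.0, est_nmp_time=3.0, llc_hit_rate=0.9),
    ]
    for r in regions:
        print(f"{r.name} -> {place_region(r)}")
```

A region that profits from NMP and exhibits high LLC locality lands on the NMP-enabled LLC, one with poor locality goes to the 3D memory stacks, and a region that would slow down near memory stays on the host.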
About the Journal
IEEE Transactions on Emerging Topics in Computing publishes papers on emerging aspects of computer science, computing technology, and computing applications not currently covered by other IEEE Computer Society Transactions. Some examples of emerging topics in computing include: IT for green, synthetic and organic computing structures and systems, advanced analytics, social/occupational computing, location-based/client computer systems, morphic computer design, electronic game systems, and health-care IT.