UVMMU: Hardware-Offloaded Page Migration for Heterogeneous Computing

2023 Design, Automation & Test in Europe Conference & Exhibition (DATE). Pub Date: 2023-04-01. DOI: 10.23919/DATE56975.2023.10137307

Jihun Park, Donghun Jeong, Jungrae Kim


Abstract

In a heterogeneous computing system with multiple memories, placing data near its current processing unit and migrating data over time can significantly improve performance. GPU vendors have introduced Unified Memory (UM) to automate data migrations between CPU and GPU memories and support memory over-subscription. Although UM improves software programmability, it can incur high costs due to its software-based migration. We propose a novel architecture to offload the migration to hardware and minimize UM overheads. Unified Virtual Memory Management Unit (UVMMU) detects access to remote memories and migrates pages without software intervention. By replacing page faults and software handling with hardware offloading, UVMMU can reduce the page migration latency to a few µs. Our evaluation shows that UVMMU can achieve 1.59× and 2.40× speed-ups over the state-of-the-art UM solutions for no over-subscription and 150% over-subscription, respectively.
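To make the over-subscription scenario concrete, the following toy model counts page migrations when a GPU touches pages on demand. It is only a sketch under stated assumptions, not the UVMMU design: the class `ToyUnifiedMemory`, the helper `run_sweeps`, the LRU eviction policy, the linear sweep access pattern, and the page counts are all hypothetical, and reading "150% over-subscription" as a working set 1.5 times the GPU capacity is also an assumption. The model counts events only and says nothing about latency.

```python
from collections import OrderedDict


class ToyUnifiedMemory:
    """Toy on-demand page migration model (illustrative assumption, not UVMMU)."""

    def __init__(self, gpu_capacity_pages):
        self.gpu_capacity_pages = gpu_capacity_pages
        # Pages resident in GPU memory, ordered from least to most recently used.
        self.gpu_resident = OrderedDict()
        self.migrations_to_gpu = 0
        self.evictions_to_cpu = 0

    def gpu_access(self, page):
        """Simulate one GPU access; migrate the page in if it is remote."""
        if page in self.gpu_resident:
            self.gpu_resident.move_to_end(page)  # local hit: refresh LRU position
            return "local"
        # Remote access detected. If GPU memory is full, evict the LRU page first.
        if len(self.gpu_resident) >= self.gpu_capacity_pages:
            self.gpu_resident.popitem(last=False)
            self.evictions_to_cpu += 1
        self.gpu_resident[page] = None
        self.migrations_to_gpu += 1
        return "migrated"


def run_sweeps(working_set_pages, gpu_capacity_pages, sweeps=2):
    """Sweep linearly over the working set several times and return the model."""
    um = ToyUnifiedMemory(gpu_capacity_pages)
    for _ in range(sweeps):
        for page in range(working_set_pages):
            um.gpu_access(page)
    return um


# Working set fits in GPU memory: pages migrate once, then stay local.
fits = run_sweeps(working_set_pages=100, gpu_capacity_pages=100)
# Working set is 1.5x the GPU capacity: a linear sweep under LRU misses every time.
oversub = run_sweeps(working_set_pages=150, gpu_capacity_pages=100)

print(fits.migrations_to_gpu, fits.evictions_to_cpu)        # 100 0
print(oversub.migrations_to_gpu, oversub.evictions_to_cpu)  # 300 200
```

In this toy setting the over-subscribed run migrates on every access, which suggests why the cost of each individual migration matters more as over-subscription grows.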
