UVMMU: Hardware-Offloaded Page Migration for Heterogeneous Computing
Jihun Park, Donghun Jeong, Jungrae Kim
2023 Design, Automation & Test in Europe Conference & Exhibition (DATE), 2023-04-01
DOI: 10.23919/DATE56975.2023.10137307 (https://doi.org/10.23919/DATE56975.2023.10137307)
Abstract
In a heterogeneous computing system with multiple memories, placing data near the processing unit that currently uses it and migrating data over time can significantly improve performance. GPU vendors have introduced Unified Memory (UM) to automate data migration between CPU and GPU memories and to support memory over-subscription. Although UM improves software programmability, its software-based migration can incur high costs. We propose a novel architecture that offloads migration to hardware to minimize UM overheads. The Unified Virtual Memory Management Unit (UVMMU) detects accesses to remote memories and migrates pages without software intervention. By replacing page faults and software handling with hardware offloading, UVMMU can reduce page migration latency to a few microseconds. Our evaluation shows that UVMMU achieves speed-ups of 1.59× and 2.40× over state-of-the-art UM solutions with no over-subscription and with 150% over-subscription, respectively.
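To make the role of over-subscription concrete, the following toy model counts how many page migrations an on-demand scheme performs for a GPU access trace. It is only a sketch under stated assumptions and does not model UVMMU or any vendor's UM driver. The function name `simulate_migrations`, the LRU eviction policy, the sequential-sweep trace, and the reading of "150% over-subscription" as a working set 1.5 times the GPU memory capacity are all illustrative choices, not details taken from the paper.

```python
from collections import OrderedDict


def simulate_migrations(accesses, gpu_capacity_pages):
    """Count page migrations for a GPU access trace under on-demand migration.

    Toy model: every page starts in CPU memory. A GPU access to a page that
    is not resident in GPU memory migrates it in; if GPU memory is full, the
    least recently used page is first evicted back to CPU memory.
    LRU is an assumption of this sketch, not a policy taken from the paper.
    """
    resident = OrderedDict()  # page id -> None, ordered from LRU to MRU
    migrations_in = 0
    evictions = 0
    for page in accesses:
        if page in resident:
            resident.move_to_end(page)  # local hit: only refresh recency
            continue
        if len(resident) >= gpu_capacity_pages:
            resident.popitem(last=False)  # evict the LRU page to CPU memory
            evictions += 1
        resident[page] = None  # migrate the page into GPU memory
        migrations_in += 1
    return migrations_in, evictions


if __name__ == "__main__":
    working_set = 6                         # pages touched by the kernel
    trace = list(range(working_set)) * 3    # three sequential sweeps
    # No over-subscription: the whole working set fits in GPU memory.
    print(simulate_migrations(trace, gpu_capacity_pages=6))  # (6, 0)
    # Working set is 1.5x the GPU capacity (assumed meaning of 150%).
    print(simulate_migrations(trace, gpu_capacity_pages=4))  # (18, 14)
```

In this model, when the working set fits, each page migrates once. Under over-subscription with a sweeping access pattern, every access triggers a migration and most also trigger an eviction. This suggests why per-migration latency weighs more heavily in the over-subscribed case, which is consistent with the larger speed-up the abstract reports there.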