Authors: Joshua Bakita, James H. Anderson
Venue: 2022 IEEE Real-Time Systems Symposium (RTSS)
Published: 2022-12-01
DOI: 10.1109/RTSS55097.2022.00039 (https://doi.org/10.1109/RTSS55097.2022.00039)
Enabling GPU Memory Oversubscription via Transparent Paging to an NVMe SSD
Safety-critical embedded systems face growing computational and memory demands as edge computing and autonomous systems gain adoption. Main memory (DRAM) is often scarce, and existing mechanisms for DRAM oversubscription, such as demand paging or compile-time transformations, either cost substantial CPU capacity or place unacceptable constraints on program structure. This work proposes an alternative: paging GPU memory buffers, rather than CPU memory buffers, directly to permanent storage to enable efficient and predictable memory oversubscription. The paper explains why GPU paging is useful and how it can be implemented efficiently. Specifically, it proposes a GPU paging implementation as an extension to NVIDIA's embedded Linux GPU drivers. In the reported experiments, this implementation was three times faster end-to-end than demand paging, with 81% lower overheads. It also exceeded the speed of the fastest preexisting Linux userspace I/O APIs while causing little DRAM and bus interference to CPU tasks: at most a 17% slowdown.
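For readers unfamiliar with the two CPU-side baselines the abstract compares against, the following is a minimal sketch of each: moving a whole buffer with explicit userspace I/O calls, and demand paging through a memory-mapped file. It is not the authors' driver-level GPU paging mechanism, which the abstract does not detail. The function names (`explicit_page_out`, `explicit_page_in`, `demand_paged_sum`) are hypothetical and chosen only for illustration.

```python
import mmap
import os

PAGE = mmap.PAGESIZE  # page size reported by the OS


def explicit_page_out(buf: bytes, path: str) -> None:
    """Explicit I/O: write the whole buffer to storage in one call."""
    with open(path, "wb") as f:
        f.write(buf)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to the device


def explicit_page_in(path: str) -> bytes:
    """Explicit I/O: read the whole buffer back in one call."""
    with open(path, "rb") as f:
        return f.read()


def demand_paged_sum(path: str) -> int:
    """Demand paging: map the file and touch one byte per page.

    Pages that are not already resident are loaded by the kernel on
    first touch, so the I/O cost is paid inside the access itself.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return sum(m[i] for i in range(0, len(m), PAGE))
```

In the explicit variant the program decides when the transfer happens. In the mapped variant the transfer is triggered implicitly by memory accesses, which is the behavior the abstract associates with CPU capacity loss.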