Dongki Kim, Sungkwang Lee, Jaewoong Chung, Daehyun Kim, Dong Hyuk Woo, S. Yoo, Sunggu Lee
{"title":"Hybrid DRAM/PRAM-based main memory for single-chip CPU/GPU","authors":"Dongki Kim, Sungkwang Lee, Jaewoong Chung, Daehyun Kim, Dong Hyuk Woo, S. Yoo, Sunggu Lee","doi":"10.1145/2228360.2228519","DOIUrl":null,"url":null,"abstract":"Single-chip CPU/GPU architecture is being adopted in high-end (embedded) systems, e.g., smartphones and tablet PCs. Main memory subsystem is expected to consist of hybrid DRAM and phase-change RAM (PRAM) due to the difficulties in DRAM scaling. In this work, we address the performance optimization of the hybrid DRAM/PRAM main memory for single chip CPU/GPU. Based on the tight requirements of low latency from CPU and the relative tolerance to long latency from GPU, DRAM is first allocated to CPU while PRAM with longer write latency is allocated to GPU. Then, in order to improve the write performance of GPU traffic, we propose (1) an in-DRAM write buffer to accommodate GPU write traffics, (2) dynamic hot data management to improve the efficiency of write buffer, (3) runtime-adaptive adjustment of write buffer size to meet the given CPU performance bound, and (4) CPU-aware DRAM access scheduling to give low latency to CPU traffics. The experiments show that the proposed method gives 1.02~44.2 times performance improvement in GPU performance with modest (negligible) CPU performance overhead (when compute-intensive CPU programs run).","PeriodicalId":263599,"journal":{"name":"DAC Design Automation Conference 2012","volume":"748 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"28","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"DAC Design Automation Conference 2012","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2228360.2228519","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 28
Abstract
Single-chip CPU/GPU architecture is being adopted in high-end (embedded) systems, e.g., smartphones and tablet PCs. Main memory subsystem is expected to consist of hybrid DRAM and phase-change RAM (PRAM) due to the difficulties in DRAM scaling. In this work, we address the performance optimization of the hybrid DRAM/PRAM main memory for single chip CPU/GPU. Based on the tight requirements of low latency from CPU and the relative tolerance to long latency from GPU, DRAM is first allocated to CPU while PRAM with longer write latency is allocated to GPU. Then, in order to improve the write performance of GPU traffic, we propose (1) an in-DRAM write buffer to accommodate GPU write traffics, (2) dynamic hot data management to improve the efficiency of write buffer, (3) runtime-adaptive adjustment of write buffer size to meet the given CPU performance bound, and (4) CPU-aware DRAM access scheduling to give low latency to CPU traffics. The experiments show that the proposed method gives 1.02~44.2 times performance improvement in GPU performance with modest (negligible) CPU performance overhead (when compute-intensive CPU programs run).