DUANG: Fast and lightweight page migration in asymmetric memory systems
Hao Wang, Jie Zhang, S. Shridhar, Gieseo Park, Myoungsoo Jung, N. Kim
2016 IEEE International Symposium on High Performance Computer Architecture (HPCA), March 2016. DOI: 10.1109/HPCA.2016.7446088
Main memory systems have gone through dramatic increases in bandwidth and capacity. At the same time, their random access latency has remained relatively constant. For a given memory technology, optimizing latency typically requires sacrificing density (i.e., cost per bit), which is one of the most critical concerns for the memory industry. Recent studies have proposed memory architectures composed of asymmetric (fast/low-density and slow/high-density) regions to balance overall latency against the negative impact on density. Such memory architectures attempt to cost-effectively offer both high capacity and high performance. Yet they present a unique challenge: they require direct placement of hot memory pages in the fast region and/or expensive runtime page migrations. In this paper, we propose a novel resistive memory architecture that shares a set of row buffers between a pair of neighboring banks. It enables two attractive techniques: (1) migrating memory pages between slow and fast banks with little performance overhead and (2) adaptively allocating more row buffers to busier banks based on memory access patterns. For an asymmetric memory architecture with both slow/high-density and fast/low-density banks, our shared row-buffer architecture captures 87-93% of the performance of a memory architecture with only fast banks.
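To make the page-migration problem concrete, the sketch below models a simple epoch-based hot-page migration policy for an asymmetric memory with fast and slow banks. It is a minimal illustration of the general idea only, not the paper's shared row-buffer hardware mechanism: the class name, hotness threshold, fast-bank capacity, and epoch structure are assumptions chosen for the example.

```python
# Toy software model of hot-page migration in an asymmetric (fast/slow bank) memory.
# Illustrative sketch only: the threshold, epoch structure, and capacity below are
# assumed values, not parameters from the DUANG paper; the paper's mechanism performs
# migration in hardware via row buffers shared between neighboring banks.

from collections import defaultdict

HOT_THRESHOLD = 64         # accesses per epoch before a slow-bank page counts as "hot" (assumed)
FAST_CAPACITY_PAGES = 256  # number of pages the fast (low-density) banks can hold (assumed)

class AsymmetricMemoryModel:
    def __init__(self):
        self.access_count = defaultdict(int)  # per-page access counts for the current epoch
        self.fast_pages = set()               # pages currently resident in fast banks
        self.migrations = 0                   # total migrations performed

    def access(self, page):
        """Record one access; counts are used at epoch end to rank pages by hotness."""
        self.access_count[page] += 1

    def end_epoch(self):
        """Migrate hot slow-bank pages into fast banks, evicting the coldest fast pages."""
        hot = [p for p, c in self.access_count.items()
               if c >= HOT_THRESHOLD and p not in self.fast_pages]
        # Hottest first, so the most beneficial migrations happen while room remains.
        hot.sort(key=lambda p: self.access_count[p], reverse=True)
        for page in hot:
            if len(self.fast_pages) >= FAST_CAPACITY_PAGES:
                coldest = min(self.fast_pages, key=lambda p: self.access_count[p])
                # Stop once the incoming page is no hotter than the coldest resident page.
                if self.access_count[page] <= self.access_count[coldest]:
                    break
                self.fast_pages.remove(coldest)
            self.fast_pages.add(page)
            self.migrations += 1
        self.access_count.clear()  # start the next epoch with fresh counters
```

In a conventional asymmetric design, each migration in this loop would cost a full page copy across the memory channel; the abstract's claim is that sharing row buffers between a fast bank and its slow neighbor makes the same logical move cheap enough to perform at runtime with little performance overhead.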