Analysis of asymmetric 3D DRAM architecture in combination with L2 cache size reduction
A. Schönberger, K. Hofmann
2015 International Conference on High Performance Computing & Simulation (HPCS), 2015-07-20
DOI: 10.1109/HPCSim.2015.7237030 (https://doi.org/10.1109/HPCSim.2015.7237030)
Citations: 1
Abstract
Memory in modern systems is a heterogeneous complex. Improvements in DRAM access time and bandwidth obtained through die stacking can only be evaluated in interaction with hardware components, such as the underlying cache and the CPU, and with software components, such as the executed application and the processed input. In this work we analyze the encoding and decoding processes of the JPEG2000 algorithm executed on a MIPS I core for different picture sizes. We observe that for picture sizes below a particular critical value, the DRAM share of execution time reaches at most 4%. In this case, no DRAM improvement would lead to a significant performance gain for the whole system. Beyond a particular picture size, which depends on the last-level cache size, the acceleration effect of the cache falls off and the DRAM influence rises to as much as 25%, where it remains for larger pictures. System-level estimation shows that our suggested 3D DRAM architecture can reduce that rise to a third and can partially take over cache functionality.
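The claim that a 4% DRAM share leaves little room for a system-level gain can be illustrated with Amdahl's law. The sketch below is not taken from the paper: it assumes the DRAM share is a strictly serial fraction of execution time, and the function names and the DRAM speedup factor of 2 are hypothetical values chosen for illustration only.

```python
def overall_speedup(dram_share: float, dram_speedup: float) -> float:
    """Amdahl's law: whole-system speedup when only the DRAM part gets faster.

    dram_share   -- fraction of execution time spent on DRAM (0..1)
    dram_speedup -- hypothetical factor by which DRAM time shrinks (> 0)
    """
    if not 0.0 <= dram_share <= 1.0:
        raise ValueError("dram_share must be in [0, 1]")
    if dram_speedup <= 0.0:
        raise ValueError("dram_speedup must be positive")
    return 1.0 / ((1.0 - dram_share) + dram_share / dram_speedup)


def max_speedup(dram_share: float) -> float:
    """Upper bound on system speedup if DRAM time dropped to zero."""
    if not 0.0 <= dram_share < 1.0:
        raise ValueError("dram_share must be in [0, 1)")
    return 1.0 / (1.0 - dram_share)


if __name__ == "__main__":
    # DRAM shares reported in the abstract: at most 4% for small pictures,
    # up to 25% once the working set exceeds the last-level cache.
    for share in (0.04, 0.25):
        # The factor 2.0 is an assumed DRAM improvement, not a measured one.
        print(
            f"DRAM share {share:.0%}: "
            f"2x faster DRAM -> {overall_speedup(share, 2.0):.3f}x overall, "
            f"upper bound {max_speedup(share):.3f}x"
        )
```

Under these assumptions the upper bound is about 1.04x for a 4% share and about 1.33x for a 25% share, which is consistent with the abstract's observation that DRAM improvements only pay off once the picture size exceeds what the last-level cache can hold.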