近内存处理的算法/体系结构协同设计

ACM SIGOPS Oper. Syst. Rev. Pub Date : 2018-08-28 DOI:10.1145/3273982.3273992

M. Drumond, Alexandros Daglis, Nooshin Mirzadeh, Dmitrii Ustiugov, Javier Picorel, B. Falsafi, Boris Grot, D. Pnevmatikatos

{"title":"近内存处理的算法/体系结构协同设计","authors":"M. Drumond, Alexandros Daglis, Nooshin Mirzadeh, Dmitrii Ustiugov, Javier Picorel, B. Falsafi, Boris Grot, D. Pnevmatikatos","doi":"10.1145/3273982.3273992","DOIUrl":null,"url":null,"abstract":"With mainstream technologies to couple logic tightly with memory on the horizon, near-memory processing has re-emerged as a promising approach to improving performance and energy for data-centric computing. DRAM, however, is primarily designed for density and low cost, with a rigid internal organization that favors coarse-grain streaming rather than byte-level random access. This paper makes the case that treating DRAM as a block-oriented streaming device yields significant efficiency and performance benefits, which motivate for algorithm/architecture co-design to favor streaming access patterns, even at the price of a higher order algorithmic complexity. We present the Mondrian Data Engine that drastically improves the runtime and energy efficiency of basic in-memory analytic operators, despite doing more work as compared to traditional CPU-optimized algorithms, which heavily rely on random accesses and deep cache hierarchies","PeriodicalId":7046,"journal":{"name":"ACM SIGOPS Oper. Syst. Rev.","volume":"12 1","pages":"109-122"},"PeriodicalIF":0.0000,"publicationDate":"2018-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"Algorithm/Architecture Co-Design for Near-Memory Processing\",\"authors\":\"M. Drumond, Alexandros Daglis, Nooshin Mirzadeh, Dmitrii Ustiugov, Javier Picorel, B. Falsafi, Boris Grot, D. Pnevmatikatos\",\"doi\":\"10.1145/3273982.3273992\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"With mainstream technologies to couple logic tightly with memory on the horizon, near-memory processing has re-emerged as a promising approach to improving performance and energy for data-centric computing. DRAM, however, is primarily designed for density and low cost, with a rigid internal organization that favors coarse-grain streaming rather than byte-level random access. This paper makes the case that treating DRAM as a block-oriented streaming device yields significant efficiency and performance benefits, which motivate for algorithm/architecture co-design to favor streaming access patterns, even at the price of a higher order algorithmic complexity. We present the Mondrian Data Engine that drastically improves the runtime and energy efficiency of basic in-memory analytic operators, despite doing more work as compared to traditional CPU-optimized algorithms, which heavily rely on random accesses and deep cache hierarchies\",\"PeriodicalId\":7046,\"journal\":{\"name\":\"ACM SIGOPS Oper. Syst. Rev.\",\"volume\":\"12 1\",\"pages\":\"109-122\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-08-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM SIGOPS Oper. Syst. Rev.\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3273982.3273992\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM SIGOPS Oper. Syst. Rev.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3273982.3273992","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 4

摘要

随着主流技术将逻辑与内存紧密结合在一起，近内存处理已经重新成为一种有前途的方法，可以提高以数据为中心的计算的性能和能源。然而，DRAM主要是为密度和低成本而设计的，具有严格的内部组织，支持粗粒度流而不是字节级随机访问。本文认为，将DRAM作为面向块的流设备可以产生显著的效率和性能优势，这促使算法/架构协同设计倾向于流访问模式，即使以更高阶算法复杂性为代价。我们提出了Mondrian数据引擎，它大大提高了基本内存分析运算符的运行时间和能源效率，尽管与传统的cpu优化算法相比，它做了更多的工作，这些算法严重依赖于随机访问和深度缓存层次结构

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Algorithm/Architecture Co-Design for Near-Memory Processing

With mainstream technologies to couple logic tightly with memory on the horizon, near-memory processing has re-emerged as a promising approach to improving performance and energy for data-centric computing. DRAM, however, is primarily designed for density and low cost, with a rigid internal organization that favors coarse-grain streaming rather than byte-level random access. This paper makes the case that treating DRAM as a block-oriented streaming device yields significant efficiency and performance benefits, which motivate for algorithm/architecture co-design to favor streaming access patterns, even at the price of a higher order algorithmic complexity. We present the Mondrian Data Engine that drastically improves the runtime and energy efficiency of basic in-memory analytic operators, despite doing more work as compared to traditional CPU-optimized algorithms, which heavily rely on random accesses and deep cache hierarchies

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

ACM SIGOPS Oper. Syst. Rev.

自引率

0.00%

发文量