Algorithm/Architecture Co-Design for Near-Memory Processing

ACM SIGOPS Oper. Syst. Rev. Pub Date: 2018-08-28 DOI: 10.1145/3273982.3273992

M. Drumond, Alexandros Daglis, Nooshin Mirzadeh, Dmitrii Ustiugov, Javier Picorel, B. Falsafi, Boris Grot, D. Pnevmatikatos

Citations: 4

Abstract

With mainstream technologies for tightly coupling logic with memory on the horizon, near-memory processing has re-emerged as a promising approach to improving the performance and energy efficiency of data-centric computing. DRAM, however, is designed primarily for density and low cost, and its rigid internal organization favors coarse-grain streaming over byte-level random access. This paper makes the case that treating DRAM as a block-oriented streaming device yields significant efficiency and performance benefits, which motivates algorithm/architecture co-design that favors streaming access patterns, even at the price of higher-order algorithmic complexity. We present the Mondrian Data Engine, which drastically improves the runtime and energy efficiency of basic in-memory analytic operators despite doing more work than traditional CPU-optimized algorithms, which rely heavily on random accesses and deep cache hierarchies.
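The trade-off described in the abstract can be illustrated with a join operator. The sketch below is only an illustration under assumptions: the abstract does not say which operators or algorithms the Mondrian Data Engine implements, and the function names `hash_join` and `sort_merge_join` are hypothetical. The first version does linear work but probes a hash table at effectively random addresses; the second does asymptotically more work (sorting) but its merge phase reads both inputs strictly in order, which is the streaming-friendly pattern the abstract argues for.

```python
from typing import List, Tuple

Row = Tuple[int, str]            # (join key, payload); illustrative schema
Joined = Tuple[int, str, str]    # (join key, left payload, right payload)


def hash_join(left: List[Row], right: List[Row]) -> List[Joined]:
    """O(n + m) work, but every probe is a random access into the table."""
    table: dict = {}
    for key, payload in left:
        table.setdefault(key, []).append(payload)
    out: List[Joined] = []
    for key, payload in right:
        # Each lookup lands at an unpredictable location in memory.
        for left_payload in table.get(key, []):
            out.append((key, left_payload, payload))
    return out


def sort_merge_join(left: List[Row], right: List[Row]) -> List[Joined]:
    """O(n log n + m log m) work, but the merge scans both inputs in order."""
    ls = sorted(left, key=lambda r: r[0])
    rs = sorted(right, key=lambda r: r[0])
    out: List[Joined] = []
    i = j = 0
    while i < len(ls) and j < len(rs):
        lk, rk = ls[i][0], rs[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of equal keys on each side, then pair them up.
            i_end = i
            while i_end < len(ls) and ls[i_end][0] == lk:
                i_end += 1
            j_end = j
            while j_end < len(rs) and rs[j_end][0] == lk:
                j_end += 1
            for a in range(i, i_end):
                for b in range(j, j_end):
                    out.append((lk, ls[a][1], rs[b][1]))
            i, j = i_end, j_end
    return out
```

Both functions return the same set of joined rows, possibly in a different order. In plain Python the access-pattern difference has no measurable hardware effect; the point is only to show how an algorithm with higher complexity can replace random probes with sequential scans.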

