A low cost, multithreaded processing-in-memory system

Workshop on Memory Performance Issues. Pub Date: 2004-06-20. DOI: 10.1145/1054943.1054946

J. Brockman, Shyamkumar Thoziyoor, Shannon K. Kuntz, P. Kogge

Citations: 39

Abstract

This paper discusses die cost vs. performance tradeoffs for a PIM system that could serve as the memory system of a host processor. For an increase of less than twice the cost of a commodity DRAM part, it is possible to realize a performance speedup of nearly a factor of 4 on irregular applications. This cost efficiency derives from developing a custom multithreaded processor architecture and implementation style that is well-suited for embedding in a memory. Specifically, it takes advantage of the low latency and high row bandwidth both to simplify processor design, which reduces area, and to improve processing throughput. To support our claims of cost and performance, we have used simulation and analysis of existing chips, and have also designed and fully implemented a prototype chip, PIM Lite.
