一种有效支持人工神经网络的智能存储器架构

K. Großpietsch, J. Büddefeld
{"title":"一种有效支持人工神经网络的智能存储器架构","authors":"K. Großpietsch, J. Büddefeld","doi":"10.1109/EMPDP.2001.905074","DOIUrl":null,"url":null,"abstract":"A \"smart memory\" approach is presented, i.e. the new architecture is achieved by extending the functionality of a conventional RAM structure. The architecture additionally contains two innovative features: To every word cell of w bits, a small q bits wide ALU is associated; and by means of extending the memory decoder, multiple access to certain sets of word cells within the memory as well as activation of their ALUs is possible. It is shown that based on these features, the standard numerical problem of adding up the m components of a vector of dimension m, in the new architecture can be carried out in a time complexity of O(square root(m)). For the execution of artificial neural nets, especially the on-line recognition of patterns mainly depends on the time-efficient efficient execution of weighted sums. It is shown that in our architecture, these weighted sums can be computed quite efficiently. The computation time is highly superior to the time complexity on sequential von Neumann machines. In addition, we show that if requested, the training mode of a neural net can also be significantly be speeded up. This is achieved by means of a simple crossbar switch which can be modularly added to the array of memory chips.","PeriodicalId":262971,"journal":{"name":"Proceedings Ninth Euromicro Workshop on Parallel and Distributed Processing","volume":"269 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2001-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"A smart memory architecture for the efficient support of artificial neural nets\",\"authors\":\"K. Großpietsch, J. Büddefeld\",\"doi\":\"10.1109/EMPDP.2001.905074\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"A \\\"smart memory\\\" approach is presented, i.e. the new architecture is achieved by extending the functionality of a conventional RAM structure. The architecture additionally contains two innovative features: To every word cell of w bits, a small q bits wide ALU is associated; and by means of extending the memory decoder, multiple access to certain sets of word cells within the memory as well as activation of their ALUs is possible. It is shown that based on these features, the standard numerical problem of adding up the m components of a vector of dimension m, in the new architecture can be carried out in a time complexity of O(square root(m)). For the execution of artificial neural nets, especially the on-line recognition of patterns mainly depends on the time-efficient efficient execution of weighted sums. It is shown that in our architecture, these weighted sums can be computed quite efficiently. The computation time is highly superior to the time complexity on sequential von Neumann machines. In addition, we show that if requested, the training mode of a neural net can also be significantly be speeded up. This is achieved by means of a simple crossbar switch which can be modularly added to the array of memory chips.\",\"PeriodicalId\":262971,\"journal\":{\"name\":\"Proceedings Ninth Euromicro Workshop on Parallel and Distributed Processing\",\"volume\":\"269 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2001-02-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings Ninth Euromicro Workshop on Parallel and Distributed Processing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/EMPDP.2001.905074\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings Ninth Euromicro Workshop on Parallel and Distributed Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/EMPDP.2001.905074","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

摘要

提出了一种“智能存储器”方法,即通过扩展传统RAM结构的功能来实现新架构。该架构还包含两个创新特性:对于每个w位的字单元,关联一个小的q位宽的ALU;通过扩展记忆解码器,可以多次访问记忆中的某些词单元集并激活它们的alu。结果表明,基于这些特征,在新的体系结构中,将m维向量的m个分量相加的标准数值问题可以在0(平方根(m))的时间复杂度内完成。对于人工神经网络的执行,特别是模式的在线识别,主要依赖于时间效率的加权和的高效执行。结果表明,在我们的体系结构中,这些加权和可以相当有效地计算出来。计算时间大大优于顺序冯·诺依曼机器的时间复杂度。此外,我们还表明,如果有要求,神经网络的训练模式也可以显著加快。这是通过一个简单的交叉开关来实现的,该开关可以模块化地添加到存储芯片阵列中。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
A smart memory architecture for the efficient support of artificial neural nets
A "smart memory" approach is presented, i.e. the new architecture is achieved by extending the functionality of a conventional RAM structure. The architecture additionally contains two innovative features: To every word cell of w bits, a small q bits wide ALU is associated; and by means of extending the memory decoder, multiple access to certain sets of word cells within the memory as well as activation of their ALUs is possible. It is shown that based on these features, the standard numerical problem of adding up the m components of a vector of dimension m, in the new architecture can be carried out in a time complexity of O(square root(m)). For the execution of artificial neural nets, especially the on-line recognition of patterns mainly depends on the time-efficient efficient execution of weighted sums. It is shown that in our architecture, these weighted sums can be computed quite efficiently. The computation time is highly superior to the time complexity on sequential von Neumann machines. In addition, we show that if requested, the training mode of a neural net can also be significantly be speeded up. This is achieved by means of a simple crossbar switch which can be modularly added to the array of memory chips.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信