Processor-in-memory support for artificial neural networks

J. Schabel, Lee Baker, Sumon Dey, Weifu Li, P. Franzon
Published in: 2016 IEEE International Conference on Rebooting Computing (ICRC), 2016-10-01
DOI: 10.1109/ICRC.2016.7738697 (https://doi.org/10.1109/ICRC.2016.7738697)
Citations: 3

Abstract

Hardware acceleration of artificial neural network (ANN) processing has the potential to support applications that benefit from real-time, low-power operation, such as autonomous vehicles, robotics, recognition, and data mining. Most interest in ANNs targets the acceleration of deep multi-layered ANNs, which can require days of offline training to converge on a desired network behavior. Interest has grown in ANNs capable of unsupervised training, where networks learn new information dynamically from unlabeled data without offline training. These ANNs require large memories with bandwidths much higher than modern GPGPUs support. Custom hardware acceleration and memory co-design hold the potential to provide real-time performance in cases where modern GPGPUs cannot meet the performance requirements. This work presents a custom processor solution to accelerate two hetero-associative memories (Sparsey and HTM) capable of unsupervised and one-hot learning. The processor is implemented as an expandable ASIP built on a configurable SIMD engine that exploits parallelism. Functional specialization is implemented using processor-in-memory techniques, which yield up to a 20× speedup and a 2000× reduction in energy per frame compared to a software implementation running on a dataset for human action recognition.
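As a rough reading of the closing figures, and not something the abstract itself states: energy per frame is average power multiplied by time per frame, so if both upper-bound figures applied to the same configuration, a 20× shorter frame time together with a 2000× lower energy per frame would imply roughly 100× lower average power. The sketch below works through that arithmetic. The function name and the assumption that both figures hold simultaneously are illustrative only.

```python
def implied_power_reduction(speedup: float, energy_reduction: float) -> float:
    """Return the implied reduction factor in average power.

    Energy per frame is E = P * t, so
    E_sw / E_hw = (P_sw / P_hw) * (t_sw / t_hw).
    Dividing the energy ratio by the time ratio leaves the power ratio.
    Assumption (hypothetical): both factors describe the same workload.
    """
    if speedup <= 0 or energy_reduction <= 0:
        raise ValueError("factors must be positive")
    return energy_reduction / speedup


if __name__ == "__main__":
    # Upper-bound figures quoted in the abstract: 20x speedup, 2000x energy.
    print(implied_power_reduction(20.0, 2000.0))
```

Because the abstract reports both numbers as "up to" values, the result is best read as an order-of-magnitude indication rather than a measured power figure.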