Error-latency Trade-off for Asynchronous Stochastic Computing with ΣΔ Streams for the IoT

Patricia Gonzalez-Guerrero, S. G. Wilson, M. Stan
Published in: 2019 32nd IEEE International System-on-Chip Conference (SOCC)
Publication date: 2019-09-01
DOI: https://doi.org/10.1109/SOCC46988.2019.1570548453
Citations: 4

Abstract

Asynchronous stochastic computing (ASC) using continuous-time asynchronous ΣΔ modulators (SC-AΣΔM) has the potential to enable ultra-low-power, on-node machine learning algorithms for the next generation of sensors for the Internet of Things (IoT). Similar to synchronous stochastic computing (SSC)¹, in SC-AΣΔM complex processing units can be implemented with simple gates because numbers are represented with streams. For example, a multiplier is implemented with an XNOR gate, yielding savings in power and area of 90% compared with the typical binary approach. Previous work demonstrated that SC-AΣΔM leverages the advantages of SSC and addresses its drawbacks, achieving significant savings in energy, power and latency.

In this work, we study a theoretical model to determine the fundamental limits of accuracy and computing time for SC-AΣΔM. Since the ΣΔ streams are periodic, the final computing error is non-zero and depends on the period of the input streams. We validate our theoretical model with SPICE-level simulations and evaluate the power and energy consumption using a standard FinFET 1X² technology for two cases: 1) multiplication and 2) gamma correction, an image processing algorithm. Our work determines circuit design guidelines for SC-AΣΔM and shows that multiplication with SC-AΣΔM requires at least 6× less time than SSC. The latency reduction and novel architecture positively impact the overall energy consumption in the IoT node, enabling energy savings of 79% compared with the binary approach.

¹ SC is by definition a synchronous approach, thus we use SSC to differentiate it from asynchronous stochastic computing.
² In modern technologies the node number does not refer to any one feature in the process, and foundries use slightly different conventions; we use 1X to denote the 14/16 nm FinFET nodes offered by the foundry.
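The XNOR-gate multiplier mentioned in the abstract can be illustrated with a minimal sketch of conventional synchronous stochastic computing. This is not the paper's asynchronous ΣΔ scheme: it assumes the common bipolar encoding, where a value x in [-1, 1] is carried by a random bit stream with P(bit = 1) = (x + 1) / 2, and all function names are hypothetical. For two independent streams, the XNOR output has P(1) = (1 + xy) / 2, so it decodes to the product xy.

```python
import random


def encode_bipolar(x, n, rng):
    """Encode x in [-1, 1] as n random bits with P(bit = 1) = (x + 1) / 2."""
    p = (x + 1.0) / 2.0
    return [1 if rng.random() < p else 0 for _ in range(n)]


def decode_bipolar(bits):
    """Recover the bipolar value from the fraction of ones in the stream."""
    return 2.0 * sum(bits) / len(bits) - 1.0


def xnor_multiply(a_bits, b_bits):
    """Bitwise XNOR of two streams; this single gate is the whole multiplier."""
    return [1 - (a ^ b) for a, b in zip(a_bits, b_bits)]


def sc_multiply(x, y, n=4096, seed=0):
    """Estimate x * y from two independent random streams of length n."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    a_bits = encode_bipolar(x, n, rng)
    b_bits = encode_bipolar(y, n, rng)
    return decode_bipolar(xnor_multiply(a_bits, b_bits))


if __name__ == "__main__":
    for n in (64, 1024, 16384):
        estimate = sc_multiply(0.5, -0.4, n=n)
        print(f"n={n:6d}  estimate={estimate:+.4f}  exact={0.5 * -0.4:+.4f}")
```

With random streams the estimate only converges as the stream gets longer, which is the latency cost of the synchronous approach that the abstract contrasts with SC-AΣΔM.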