K. Mahmoud, W. E. Smith, Mark Fishkin, Timothy N. Miller
2015 33rd IEEE International Conference on Computer Design (ICCD), published 2015-10-18
DOI: 10.1109/ICCD.2015.7357142
Citations: 0
Abstract
We present a tool for automatically generating efficient feed-forward logic for hardware acceleration of artificial neural networks (ANNs). It produces circuitry in the form of synthesizable Verilog code that is optimized by analyzing training data to minimize the number of bits in weights and values, thereby minimizing the number of logic gates in ANN components such as adders and multipliers. For an optimized ANN, different implementation topologies can be generated, including fully pipelined designs and simple state machines. Additional insights about hardware acceleration for neural networks are also presented. We show the impact of reducing precision relative to floating point and present area, power, delay, throughput, and energy estimates obtained by circuit synthesis.
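The core idea of analyzing training data to minimize bit widths can be sketched as follows. This is an illustrative reconstruction, not the paper's actual tool: the function name, the fixed-point format search, and the error tolerance are all assumptions made for the example. It picks the smallest signed fixed-point format that represents a set of observed values (e.g., trained weights) within a given absolute error, which is the kind of per-component width a generator could then use to size adders and multipliers in emitted Verilog.

```python
import math

def fixed_point_bits(values, frac_bits_max=16, tol=1e-3):
    """Hypothetical helper: choose the smallest (int_bits, frac_bits)
    signed fixed-point format representing every value in `values`
    within `tol` absolute error. Not from the paper; an illustration
    of data-driven bit-width minimization."""
    # Integer bits: enough to cover the largest magnitude, plus a sign bit.
    max_mag = max(abs(v) for v in values)
    int_bits = max(1, math.ceil(math.log2(max_mag + 1))) + 1
    # Fractional bits: grow until quantization error falls below tol.
    frac_bits = 0
    for frac_bits in range(frac_bits_max + 1):
        scale = 1 << frac_bits
        if all(abs(round(v * scale) / scale - v) <= tol for v in values):
            break
    return int_bits, frac_bits

# Example: widths needed for a small set of trained weights.
weights = [0.5, -1.25, 0.625, 2.0]
print(fixed_point_bits(weights))  # → (3, 3)
```

A fully pipelined topology would instantiate one sized multiplier per weight at these widths, while a state-machine topology would reuse a single multiplier sized for the widest format across cycles.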