Proceedings of the 2007 international symposium on Low power electronics and design (ISLPED '07): Latest Articles, Page 7

A Fully Onchip Binarized Convolutional Neural Network FPGA Impelmentation with Accurate Inference

Proceedings of the 2007 international symposium on Low power electronics and design (ISLPED '07) Pub Date : 2018-07-23 DOI: 10.1145/3218603.3218615

Li Yang, Zhezhi He, Deliang Fan

Abstract: Deep convolutional neural networks play an important role in machine learning and are widely used in computer vision tasks. However, their enormous model size and massive computation cost have become the main obstacle to deploying such powerful algorithms on low-power, resource-limited embedded systems such as FPGAs. Recent work has shown that binarized neural networks (BNNs), which use binarized (i.e., +1 and -1) convolution kernels and binary activation functions, can significantly reduce model size and computational complexity, which paves the way for energy-efficient FPGA implementations. In this work, we first propose a new BNN algorithm, called Parallel-Convolution BNN (PC-BNN), which replaces the original binary convolution layer in a conventional BNN with two parallel binary convolution layers. PC-BNN achieves ~86% accuracy on the CIFAR-10 dataset with a parameter size of only 2.3Mb. We then deploy PC-BNN on the Xilinx PYNQ Z1 FPGA board, which has only 4.9Mb of on-chip RAM. Because the network parameters are so small, the whole network can be stored in on-chip RAM, which greatly reduces the energy and delay overhead of loading network parameters from off-chip memory. In addition, a new data-streaming pipeline architecture is proposed in the PC-BNN FPGA implementation to further improve throughput. The experimental results show that our PC-BNN-based FPGA implementation achieves 930 frames per second, 387.5 FPS/Watt and 396×10^-4 FPS/LUT, which are among the best throughput and energy-efficiency figures compared with most recent works.
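The abstract above relies on +1/-1 kernels and activations. A minimal sketch of why this is cheap in hardware: when +1 is stored as bit 1 and -1 as bit 0, a dot product reduces to an XNOR followed by a population count. The function names here are hypothetical and are not taken from the paper; the sketch covers a single binary dot product only and does not model how PC-BNN combines its two parallel convolution branches, which the abstract does not specify.

```python
def binarize(values):
    """Map real values to +1/-1 with the sign function (0 maps to +1 by assumption)."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs):
    """Pack a +1/-1 list into an int: bit i is 1 when signs[i] == +1."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
    return word


def xnor_popcount_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n.

    XNOR marks positions where the signs agree; each agreement
    contributes +1 and each disagreement -1, so
    dot = matches - (n - matches) = 2 * matches - n.
    """
    mask = (1 << n) - 1                 # keep only the n valid bit positions
    agree = ~(a_bits ^ b_bits) & mask   # XNOR restricted to n bits
    matches = bin(agree).count("1")     # population count
    return 2 * matches - n


def reference_dot(a, b):
    """Plain multiply-accumulate over +1/-1 lists, used to cross-check."""
    return sum(x * y for x, y in zip(a, b))
```

The packed form lets one machine word (or one LUT-based XNOR/popcount tree on an FPGA) stand in for many multiply-accumulate operations, which is the general reason BNNs suit resource-limited devices.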

Citations: 36

In-situ Stochastic Training of MTJ Crossbar based Neural Networks

Proceedings of the 2007 international symposium on Low power electronics and design (ISLPED '07) Pub Date : 2018-06-24 DOI: 10.1145/3218603.3218616

Ankit Mondal, Ankur Srivastava

Abstract: Owing to their high device density, scalability and non-volatility, Magnetic Tunnel Junction (MTJ)-based crossbars have garnered significant interest for implementing the weights of an artificial neural network. The existence of only two stable states in MTJs implies a high overhead of obtaining optimal binary weights in software. We illustrate that the inherent parallelism in the crossbar structure makes it highly appropriate for in-situ training, wherein the network is taught directly on the hardware. This leads to a significantly smaller training overhead, as the training time is independent of the size of the network, while also circumventing the effects of alternate current paths in the crossbar and accounting for manufacturing variations in the device. We show how the stochastic switching characteristics of MTJs can be leveraged to perform probabilistic weight updates using the gradient descent algorithm. We describe how the update operations can be performed on crossbars both with and without access transistors, and perform simulations on them to demonstrate the effectiveness of our techniques. The results reveal that stochastically trained MTJ-crossbar NNs achieve a classification accuracy nearly the same as that of real-valued-weight networks trained in software and exhibit immunity to device variations.
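The idea of probabilistic weight updates on two-state devices can be illustrated in software. The sketch below is an assumption-laden toy model, not the authors' update rule: each binary weight flips toward the direction that reduces the loss with a probability that grows with the gradient magnitude. The function name, the `lr` scaling and the linear probability model are all hypothetical.

```python
import random


def stochastic_binary_update(weights, grads, lr, rng):
    """Return new +1/-1 weights after one probabilistic update step.

    Gradient descent wants to move each weight against its gradient.
    A binary weight can only flip, so (by assumption here) it flips
    with probability min(1, lr * |grad|), and only when it does not
    already sit at the desired state.
    """
    updated = []
    for w, g in zip(weights, grads):
        if g == 0:
            updated.append(w)           # no gradient, no update
            continue
        target = -1 if g > 0 else 1     # descent direction for this weight
        p_switch = min(1.0, lr * abs(g))
        if w != target and rng.random() < p_switch:
            updated.append(target)      # stochastic switch succeeded
        else:
            updated.append(w)           # already at target, or switch failed
    return updated


if __name__ == "__main__":
    rng = random.Random(0)              # fixed seed for repeatable runs
    print(stochastic_binary_update([1, -1, 1, -1], [0.3, 0.3, -0.3, -0.3], 0.5, rng))
```

Averaged over many steps, the expected change of each weight is proportional to the gradient, which is what lets a population of two-state devices approximate a real-valued gradient descent step.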

Citations: 10

Deploying Customized Data Representation and Approximate Computing in Machine Learning Applications

Proceedings of the 2007 international symposium on Low power electronics and design (ISLPED '07) Pub Date : 2018-06-03 DOI: 10.1145/3218603.3218612

M. Nazemi, Massoud Pedram

Abstract: Major advancements in building general-purpose and customized hardware have been one of the key enablers of the versatility and pervasiveness of machine learning models such as deep neural networks. To sustain this ubiquitous deployment of machine learning models and cope with their computational and storage complexity, several solutions have been employed, such as low-precision representation of model parameters using fixed-point representation and approximate arithmetic operations. Studying the potency of such solutions in different applications requires integrating them into existing machine learning frameworks for high-level simulations, as well as implementing them in hardware to analyze their effects on power/energy dissipation, throughput, and chip area. Lop is a library for design space exploration that bridges the gap between machine learning and efficient hardware realization. It comprises a Python module, which can be integrated with some of the existing machine learning frameworks and implements various customizable data representations, including fixed-point and floating-point, as well as approximate arithmetic operations. Furthermore, it includes a highly parameterized Scala module, which allows synthesizing hardware based on the said data representations and arithmetic operations. Lop allows researchers and designers to quickly compare the quality of their models using various data representations and arithmetic operations in Python, and to contrast the hardware cost of viable representations by synthesizing them on their target platforms (e.g., FPGA or ASIC). To the best of our knowledge, Lop is the first library that allows both software simulation and hardware realization using customized data representations and approximate computing techniques.
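To make the notion of a customizable fixed-point representation concrete, here is a minimal sketch of signed fixed-point quantization with saturation. It does not use or reproduce Lop's API, which the abstract does not describe; `to_fixed_point` and its parameters are hypothetical names chosen for illustration.

```python
def to_fixed_point(value, total_bits, frac_bits):
    """Quantize a real value to signed two's-complement fixed point.

    total_bits counts the sign bit; frac_bits of them sit after the
    binary point. Out-of-range values saturate rather than wrap.
    Rounding uses Python's built-in round (ties go to the even code).
    """
    scale = 1 << frac_bits                   # 2 ** frac_bits
    lo = -(1 << (total_bits - 1))            # most negative integer code
    hi = (1 << (total_bits - 1)) - 1         # most positive integer code
    code = round(value * scale)              # nearest representable code
    code = max(lo, min(hi, code))            # saturate to the valid range
    return code / scale                      # back to a real value


def quantization_error(values, total_bits, frac_bits):
    """Largest absolute error introduced by quantizing each value."""
    return max(abs(v - to_fixed_point(v, total_bits, frac_bits)) for v in values)
```

Sweeping `total_bits` and `frac_bits` over a model's parameters and watching `quantization_error` (or the resulting model accuracy) is the kind of software-side comparison the abstract describes, before any hardware cost is measured.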

Citations: 5

AxTrain: Hardware-Oriented Neural Network Training for Approximate Inference

Proceedings of the 2007 international symposium on Low power electronics and design (ISLPED '07) Pub Date : 2018-05-21 DOI: 10.1145/3218603.3218643

Xin He, Liu Ke, Wenyan Lu, Guihai Yan, Xuan Zhang

Citations: 30

Keynote: Peering into the post Moore's Law world

Proceedings of the 2007 international symposium on Low power electronics and design (ISLPED '07) Pub Date : 2017-07-01 DOI: 10.1109/ISLPED.2017.8009150

T. Austin

Citations: 0

Keynote: Architecture and software for emerging low-power systems

Proceedings of the 2007 international symposium on Low power electronics and design (ISLPED '07) Pub Date : 2017-07-01 DOI: 10.1109/ISLPED.2017.8009151

Wen-mei W. Hwu

Abstract: We have been experiencing two very important developments in computing. On the one hand, a tremendous amount of resources has been invested in innovative applications such as first-principles-based models, deep learning and cognitive computing. On the other hand, the industry has been taking a technological path where application performance and power efficiency vary by more than two orders of magnitude depending on their parallelism, heterogeneity, and locality. We envision that a “perfect storm” is coming for future computing, resulting from the fact that data movement has become the dominant factor for both the power and performance of high-value applications. It will be critical to match the compute throughput to the data access bandwidth and to locate the compute where the data is. Much has been learned, and much still needs to be learned, about algorithms, languages, compilers and hardware architecture in this movement. What are the killer applications that may become the new driver for future technology development? How hard is it to program existing systems to address the data movement issues today? How will we program these systems in the future? How will innovations in memory devices present further opportunities and challenges in designing new systems? What is the impact on the long-term software engineering cost of applications (and legacy applications in particular)? In this talk, I will present some lessons learned as we design the IBM-Illinois C3SR Erudite system inside this perfect storm.

Citations: 0

Keynote: A new Silicon Age 4.0: Generating semiconductor-intelligence paradigm with a Virtual Moore's Law Economics and Heterogeneous technologies

Proceedings of the 2007 international symposium on Low power electronics and design (ISLPED '07) Pub Date : 2017-07-01 DOI: 10.1109/ISLPED.2017.8009149

Nicky Liu

Abstract: The future of the silicon-based economy will not be as pessimistic as some commentators have argued, given their predictions of the end of the Moore's Law Economy (ME) by the early 2020s. On the contrary, a Virtual Moore's Law Economy (VME) will develop and thrive, advancing innovation by a new Silicon Way of producing various application-driven Heterogeneous Integrated (HI) Nano-systems by optimization of physics, materials, devices, circuits/chips, software and systems to enable exciting applications for business growth. The semiconductor industry will enjoy sufficient financial returns from new application and system-product sales, even considering more expensive silicon investment. Such a technological approach, based on a (Function × Value)-Scaling Down-Plus-Up Methodology in addition to Linear-Scaling, Area-Scaling and Volumetric-Scaling Methodologies, can fundamentally change the way of thinking and execution toward coherently optimizing both technology definition and final system design with a holistic HIDAS (HI Design/Architecture/System) method. This will drive IC scaling to an effective 1-Nanometer Realm, stimulating a thriving silicon industry which can have at least 30 more years of growth toward a 1 trillion-dollar size.

Citations: 1

Message from the program co-chairs

Proceedings of the 2007 international symposium on Low power electronics and design (ISLPED '07) Pub Date : 2017-01-01 DOI: 10.1109/ISLPED.2017.8009141

J. Kulkarni, T. Wenisch

Citations: 0

Let's get physical: Adding physical dimensions to cyber systems

Proceedings of the 2007 international symposium on Low power electronics and design (ISLPED '07) Pub Date : 2015-07-22 DOI: 10.1109/ISLPED.2015.7273478

A. S. Vincentelli

Abstract: Technology advances are creating major shifts in the industrial landscape. Traditional sectors such as transportation, medical and avionics are witnessing fundamental changes in the supply chain and in the content, where the interactions between the physical world and the computing world are becoming increasingly tight. Cyber Physical Systems, Systems of Systems, Internet of Things, Industrie 4.0, Swarm Systems and The Fog are all sectors that attract massive attention from the research communities and massive investment from industry. These concepts are tightly intertwined and describe a movement towards a fully interconnected planet where billions of devices interact via a complex mesh of wireless and wired communication infrastructures. The most compelling vision for the future of technology and industry is one where a swarm of devices is connected with the cloud to provide platforms for a myriad of new applications. In this new world, new companies will arise and established ones will have to radically change their business models. The increasing sophistication and heterogeneity of these systems require radical changes in the way sense-and-control platforms are designed to regulate them. In this presentation, I highlight some of the design challenges due to the complexity, heterogeneity and power consumption of CPS. Indeed, low power consumption is an essential requirement for the swarm of devices, especially in the domain of wearable devices for healthcare. Coupled with low cost and reliability, power consumption has to be taken into consideration for any CPS deployment.

Citations: 19

Opportunities in system power management for high performance mixed signal platforms

Proceedings of the 2007 international symposium on Low power electronics and design (ISLPED '07) Pub Date : 2015-07-22 DOI: 10.1109/ISLPED.2015.7273479

Jose Pineda de Gyvez

Citations: 0