2019 IEEE 13th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC): Latest Publications

Distributed Neural Networks using TensorFlow over Multicore and Many-Core Systems
Jagadish Kumar Ranbirsingh, Hanke Kimm, H. Kimm
{"title":"Distributed Neural Networks using TensorFlow over Multicore and Many-Core Systems","authors":"Jagadish Kumar Ranbirsingh, Hanke Kimm, H. Kimm","doi":"10.1109/MCSoC.2019.00022","DOIUrl":"https://doi.org/10.1109/MCSoC.2019.00022","url":null,"abstract":"This paper focuses on distributed deep learning models that simulate the HAR (Human Activity Recognition) data set from the UCI machine learning Repository. The proposed deep learning LSTM (Long Short-Term Memory) model works with the TensorFlow framework using the Python 3 programming language which supports the distributed architecture. In order to simulate the distributed deep learning models over different multicore and many-core systems, two hardware platforms are built; the first one is equipped with a Raspberry Pi cluster with 16 Pi 3 model B+ boards which each having 1 GB of RAM and 32 GB flash storage. The second platform is houses an Octa-core Intel Xeon CPU system with a 16MB Cache, 32 GB RAM and 2 TB SSD primary storage with 10 TB HDD secondary storage. In this paper, the performance of the distributed LSTM model over multicore and many-core systems is presented in terms of execution speed and efficiency of prediction accuracy upon varying number of deep layers with corresponding hidden nodes. In this experiment, a 3 x 3 distributed LSTM model has been used, which furnishes higher prediction accuracy with faster computation time than the models that different number of layers provide.","PeriodicalId":104240,"journal":{"name":"2019 IEEE 13th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131857909","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
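The abstract does not include the authors' implementation, so the following is a minimal sketch of a comparable setup, assuming the standard UCI HAR windowing (128 timesteps x 9 sensor channels, 6 activity classes) and TensorFlow 2's MultiWorkerMirroredStrategy as one way to distribute training across nodes such as the Raspberry Pi cluster; the layer widths and stand-in data are illustrative, not the authors' exact 3 x 3 configuration.

```python
# Illustrative sketch, not the paper's code: a stacked LSTM for UCI HAR
# trained under TensorFlow's multi-worker data parallelism.
import numpy as np
import tensorflow as tf

strategy = tf.distribute.MultiWorkerMirroredStrategy()  # one replica per worker node

with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(128, 9)),            # 128 timesteps x 9 channels
        tf.keras.layers.LSTM(32, return_sequences=True),  # stacked deep layers
        tf.keras.layers.LSTM(32, return_sequences=True),
        tf.keras.layers.LSTM(32),
        tf.keras.layers.Dense(6, activation="softmax"),   # 6 HAR activity classes
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

# Stand-in random data; the real experiment feeds UCI HAR sensor windows.
x = np.random.rand(64, 128, 9).astype("float32")
y = np.random.randint(0, 6, size=(64,))
model.fit(x, y, batch_size=16, epochs=1)
```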
Deep Learning Framework with Arbitrary Numerical Precision
M. Kiyama, M. Amagasaki, M. Iida
{"title":"Deep Learning Framework with Arbitrary Numerical Precision","authors":"M. Kiyama, M. Amagasaki, M. Iida","doi":"10.1109/MCSoC.2019.00019","DOIUrl":"https://doi.org/10.1109/MCSoC.2019.00019","url":null,"abstract":"Deep neural networks (DNNs) have recently shown outstanding performance in solving problems in many domains. However, it is difficult to run such applications on mobile devices due to limited hardware resources. Quantization is one method to reduce the hardware requirements. By default, 32-bit floating-point numbers are used in DNNs, while quantization uses fewer bits, such as 4-bit fixed points, at the cost of precision. Previous research has explored two problems related to this: (1) differences between software emulation and implementation that affect model accuracy and (2) lowered accuracy during normalization. In this paper, we developed a new DNNs framework, PyParch, that allows easy manipulation of quantization and propose a training method for fitting to a hardware-friendly model. We show that our developed tool can solve the two problems mentioned above. Quantized models described in previous methods need 18 bits in order to recover the original accuracy, whereas our method requires only 14 bits.","PeriodicalId":104240,"journal":{"name":"2019 IEEE 13th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":"111 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130435233","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
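PyParch's own API is not shown in the abstract, so the sketch below illustrates the underlying technique such a framework emulates, uniform signed fixed-point quantization; the function name and its `total_bits`/`frac_bits` parameters are hypothetical.

```python
# Minimal sketch of signed fixed-point quantization, the general technique
# behind bit-width reduction; not PyParch's actual interface.
import numpy as np

def quantize_fixed_point(x: np.ndarray, total_bits: int, frac_bits: int) -> np.ndarray:
    """Round x onto a signed fixed-point grid with frac_bits fractional bits,
    saturating to the range representable in total_bits bits."""
    scale = 2.0 ** frac_bits
    qmin = -(2 ** (total_bits - 1))      # most negative code
    qmax = 2 ** (total_bits - 1) - 1     # most positive code
    codes = np.clip(np.round(x * scale), qmin, qmax)
    return codes / scale                 # value the emulated model sees

w = np.random.randn(4, 4).astype("float32")
print(quantize_fixed_point(w, total_bits=4, frac_bits=2))  # grid step of 0.25
```

Note that the rounding and saturation modes used in software emulation must match what the hardware does, which is exactly the emulation-versus-implementation mismatch the abstract identifies as problem (1).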
Design of Asynchronous CNN Circuits on Commercial FPGA from Synchronous CNN Circuits
Hayato Kato, H. Saito
{"title":"Design of Asynchronous CNN Circuits on Commercial FPGA from Synchronous CNN Circuits","authors":"Hayato Kato, H. Saito","doi":"10.1109/MCSoC.2019.00016","DOIUrl":"https://doi.org/10.1109/MCSoC.2019.00016","url":null,"abstract":"To accelerate performance, Convolutional Neural Networks (CNNs) are frequently used in Field Programmable Gate Arrays (FPGAs). In this paper, to reduce the power consumption of CNN circuits, we propose a design method to design asynchronous CNN circuits on commercial FPGAs. First, the proposed method converts Register Transfer Level (RTL) models of synchronous CNN circuits to RTL models of asynchronous CNN circuits. Then, the proposed method designs asynchronous CNN circuits using a commercial FPGA design environment. In the experiment, we designed an asynchronous CNN circuit and evaluated the performance. Compared to the synchronous counterpart, the asynchronous CNN circuit consumed about 2.3% less energy.","PeriodicalId":104240,"journal":{"name":"2019 IEEE 13th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121755394","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Energy and Performance Analysis of STTRAM Caches for Mobile Applications
Kyle Kuan, Tosiron Adegbija
{"title":"Energy and Performance Analysis of STTRAM Caches for Mobile Applications","authors":"Kyle Kuan, Tosiron Adegbija","doi":"10.1109/MCSoC.2019.00044","DOIUrl":"https://doi.org/10.1109/MCSoC.2019.00044","url":null,"abstract":"Spin-Transfer Torque RAMs (STTRAMs) have been shown to offer much promise for implementing emerging cache architectures. This paper studies the viability of STTRAM caches for mobile workloads from the perspective of energy and latency. Specifically, we explore the benefits of reduced retention STTRAM caches for mobile applications. We analyze the characteristics of mobile applications' cache blocks and how those characteristics dictate the appropriate retention time for mobile device caches. We show that due to their inherently interactive nature, mobile applications' execution characteristics—and hence, STTRAM cache design requirements—differ from other kinds of applications. We also explore various STTRAM cache designs in both single and multicore systems, and at different cache levels, that can efficiently satisfy mobile applications' execution requirements, in order to maximize energy savings without introducing substantial latency overhead.","PeriodicalId":104240,"journal":{"name":"2019 IEEE 13th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":"88 7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126307798","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
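The retention tradeoff described above can be made concrete with a back-of-envelope energy model; everything below, including the function and all numbers, is hypothetical and only illustrates why shorter retention can win despite forced refreshes, not the paper's measurements.

```python
# Toy model: relaxing STTRAM retention cuts per-write energy, but blocks
# still live when retention expires must be refreshed or written back.

def cache_energy(n_writes, n_reads, n_expired, e_write, e_read, e_refresh):
    """Dynamic energy: ordinary accesses plus refreshes forced on blocks
    whose lifetime exceeded the chosen retention time."""
    return n_writes * e_write + n_reads * e_read + n_expired * e_refresh

# Hypothetical counts and per-access energies (arbitrary units).
long_ret  = cache_energy(1e6, 4e6, 1e3, e_write=1.0, e_read=0.3, e_refresh=1.0)
short_ret = cache_energy(1e6, 4e6, 5e4, e_write=0.4, e_read=0.3, e_refresh=0.4)
print(f"long retention:  {long_ret:.3g}")   # expirations are rare but writes cost more
print(f"short retention: {short_ret:.3g}")  # cheaper writes outweigh extra refreshes
```

The crossover depends on block lifetimes, which is why the paper profiles mobile cache-block characteristics before choosing a retention time.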