Parallelization of digit recognition system using Deep Convolutional Neural Network on CUDA
Srishti Singh, Amrit Paul, M. Arun
2017 Third International Conference on Sensing, Signal Processing and Security (ICSSS), 2017-05-01
DOI: 10.1109/SSPS.2017.8071623 (https://doi.org/10.1109/SSPS.2017.8071623)
Citations: 13
Abstract
A Compute Unified Device Architecture (CUDA) implementation of a Deep Convolutional Neural Network (DCNN) for a digit recognition system is proposed to reduce the computation time of the artificial neural network (ANN) while achieving high accuracy. The network has three convolutional layers and two fully connected layers, built from input, hidden and output neurons, to improve accuracy. It is parallelized on a dedicated GPU on the CUDA platform using the TensorFlow library. Accuracy and computation time are compared for sequential and parallel execution of the network on three systems: a dual-core CPU (4 logical processors), an octa-core CPU (16 logical processors) without a GPU, and an octa-core CPU (16 logical processors) with a GPU. The MNIST (Modified National Institute of Standards and Technology) and EMNIST (Extended MNIST) databases are used for both training and testing. MNIST has 55000 training, 10000 testing and 5000 validation samples. EMNIST consists of 235000 training, 40000 testing and 5000 validation samples. The network is computationally intensive, so parallelizing it yields a significant improvement in execution time.
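To give a rough sense of the size of a network with three convolutional layers and two fully connected layers, the sketch below counts feature-map sizes and trainable parameters for a 28x28 grayscale digit image. The abstract does not state kernel sizes, strides, channel counts or the hidden-layer width, so every value in `HYPOTHETICAL_CONVS`, `HYPOTHETICAL_HIDDEN` and the "same" padding rule is an assumption chosen only for illustration, not the authors' configuration.

```python
from dataclasses import dataclass


@dataclass
class Conv:
    out_channels: int  # number of filters in the layer
    kernel: int        # square kernel side length
    stride: int        # stride in both spatial directions


def same_padding_output(size: int, stride: int) -> int:
    """Spatial output size under 'same' padding: ceil(size / stride)."""
    return -(-size // stride)


def count_parameters(input_hw, in_channels, convs, hidden, classes):
    """Return (flattened feature count, total trainable parameters)."""
    side, channels, total = input_hw, in_channels, 0
    for layer in convs:
        # Weights: kernel * kernel * in_channels * out_channels, plus one bias per filter.
        total += layer.kernel * layer.kernel * channels * layer.out_channels
        total += layer.out_channels
        side = same_padding_output(side, layer.stride)
        channels = layer.out_channels
    flat = side * side * channels
    total += flat * hidden + hidden      # first fully connected layer
    total += hidden * classes + classes  # output layer, one unit per digit class
    return flat, total


# Hypothetical hyperparameters (NOT taken from the paper).
HYPOTHETICAL_CONVS = [Conv(4, 5, 1), Conv(8, 5, 2), Conv(12, 4, 2)]
HYPOTHETICAL_HIDDEN = 200

if __name__ == "__main__":
    flat, total = count_parameters(28, 1, HYPOTHETICAL_CONVS, HYPOTHETICAL_HIDDEN, 10)
    print(f"flattened features: {flat}, trainable parameters: {total}")
```

Under these assumed values most of the parameters sit in the first fully connected layer, while the convolutions account for most of the multiply-accumulate work per image. Both are dense, regular computations of the kind that GPUs parallelize well.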