Parallel and Distributed Training of Deep Neural Networks: A brief overview
Attila Farkas, Gábor Kertész, R. Lovas
2020 IEEE 24th International Conference on Intelligent Engineering Systems (INES), published 2020-07-01
DOI: 10.1109/INES49302.2020.9147123 (https://doi.org/10.1109/INES49302.2020.9147123)
Citation count: 9
Abstract
Deep neural networks and deep learning are becoming important and popular techniques in modern services and applications. Training these networks is computationally intensive because of the very large number of trainable parameters and the large number of training samples. This brief overview introduces current solutions that aim to speed up the training process through parallel and distributed computation. The necessary components and strategies are described, from low-level communication protocols to high-level frameworks for distributed deep learning. Current implementations of deep learning frameworks with distributed computing capabilities are compared, and key parameters are identified to help design effective solutions.