Are Existing Knowledge Transfer Techniques Effective for Deep Learning with Edge Devices?

2018 IEEE International Conference on Edge Computing (EDGE) Pub Date : 2018-06-11 DOI:10.1145/3220192.3220459

Ragini Sharma, Saman Biookaghazadeh, Baoxin Li, Ming Zhao


Citations: 39

Abstract

With the emergence of the edge computing paradigm, many applications such as image recognition and augmented reality need to perform machine learning (ML) and artificial intelligence (AI) tasks on edge devices. Most AI and ML models are large and computationally heavy, whereas edge devices usually have limited computational and storage resources. Such models can be compressed and reduced for deployment on edge devices, but they may lose capability and perform poorly. Recent work has used knowledge transfer (KT) techniques to transfer information from a large network (termed the teacher) to a small one (termed the student) in order to improve the performance of the latter. This approach seems promising for learning on edge devices, but a thorough investigation of its effectiveness is lacking. This paper provides an extensive study of the performance (in both accuracy and convergence speed) of knowledge transfer, considering different student architectures and different techniques for transferring knowledge from teacher to student. The results show that the performance of KT does vary by architecture and transfer technique. A good performance improvement is obtained by transferring knowledge from both the intermediate layers and the last layer of the teacher to a shallower student. But other architectures and transfer techniques do not fare so well, and some of them even have a negative impact on performance.
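As a rough illustration of what transferring knowledge from both the intermediate layers and the last layer can look like, the sketch below combines two commonly used loss terms: a cross-entropy between temperature-softened teacher and student outputs, and a mean squared error between intermediate activations. This is a minimal sketch under stated assumptions, not the formulation evaluated in the paper. The function names, the temperature of 4.0, the hint weight of 0.5, and the requirement that both feature vectors have the same length are all hypothetical choices made for illustration.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Log of the temperature-scaled softmax, computed in a numerically stable way."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtracting the max avoids overflow in exp
    log_total = math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - peak - log_total for s in scaled]


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a softer distribution."""
    return [math.exp(v) for v in log_softmax(logits, temperature)]


def last_layer_loss(student_logits, teacher_logits, temperature=4.0):
    """Cross-entropy between the softened teacher and student output distributions."""
    p_teacher = softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)
    return -sum(t * ls for t, ls in zip(p_teacher, log_p_student))


def intermediate_layer_loss(student_features, teacher_features):
    """Mean squared error between intermediate activations (assumed equal length)."""
    if len(student_features) != len(teacher_features):
        raise ValueError("feature vectors must have the same length")
    count = len(student_features)
    return sum((s - t) ** 2 for s, t in zip(student_features, teacher_features)) / count


def transfer_loss(student_logits, teacher_logits,
                  student_features, teacher_features,
                  temperature=4.0, hint_weight=0.5):
    """Weighted sum of the last-layer and intermediate-layer terms (illustrative weights)."""
    output_term = last_layer_loss(student_logits, teacher_logits, temperature)
    hint_term = intermediate_layer_loss(student_features, teacher_features)
    return output_term + hint_weight * hint_term
```

In a real training loop this term would typically be added to the ordinary supervised loss on the ground-truth labels, and a small adapter layer is often needed when the student's intermediate features have a different size from the teacher's; both details are omitted here.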


