Partitioning Convolutional Neural Networks for Inference on Constrained Internet-of-Things Devices
F. M. C. D. Oliveira, E. Borin
2018 30th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), September 2018
DOI: 10.1109/CAHPC.2018.8645927
Citations: 12
Abstract
With the prospect of a world in which the IoT will be pervasive in the near future, the great volume of data produced by its devices will have to be processed and interpreted in an efficient and intelligent way. One approach is fog computing, in which the network infrastructure and the devices themselves can process data. Deep learning techniques have been successfully applied to interpreting the kind of data generated by the IoT; however, even the inference execution of convolutional neural networks may be computationally costly when resource-limited devices are considered. To enable the execution of neural network models on resource-constrained IoT systems, the model may be partitioned and distributed among multiple devices. Different partitioning approaches are possible; nonetheless, some of them increase the amount of communication that must be performed between the IoT devices. In this work, we propose KLP, a Kernighan-and-Lin-based partitioning algorithm that partitions neural network models for efficient distributed execution on multiple IoT devices. Our results show that KLP is capable of producing partitions that require up to 4.5 times less communication than the partitioning approaches used by TensorFlow and other frameworks.
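To illustrate the idea the abstract builds on, the sketch below implements a simplified greedy variant of the classic Kernighan-Lin bipartitioning pass on a toy "layer graph" whose edge weights stand for tensor traffic between computation nodes. This is not the paper's KLP algorithm; the graph, layer names, and weights are hypothetical, and full Kernighan-Lin also explores temporarily negative-gain swaps, which this sketch omits.

```python
# Simplified, hedged sketch of a Kernighan-Lin-style bipartitioning pass.
# NOT the paper's KLP: toy graph, hypothetical weights, greedy swaps only.

def cut_size(edges, part_a):
    """Total weight of edges crossing the partition (communication cost)."""
    return sum(w for (u, v), w in edges.items()
               if (u in part_a) != (v in part_a))

def kl_pass(nodes, edges, part_a):
    """Greedily swap the node pair giving the largest cut-size reduction
    until no improving swap remains. Swaps keep the partition balanced."""
    part_a = set(part_a)
    part_b = set(nodes) - part_a
    improved = True
    while improved:
        improved = False
        best = None
        base = cut_size(edges, part_a)
        for a in part_a:
            for b in part_b:
                trial = (part_a - {a}) | {b}   # swap a out, b in
                gain = base - cut_size(edges, trial)
                if gain > 0 and (best is None or gain > best[0]):
                    best = (gain, a, b)
        if best:
            _, a, b = best
            part_a = (part_a - {a}) | {b}
            part_b = (part_b - {b}) | {a}
            improved = True
    return part_a

# Toy layer graph: edge weights model tensor sizes exchanged between nodes.
nodes = {"conv1", "conv2", "fc1", "fc2"}
edges = {("conv1", "conv2"): 8, ("conv2", "fc1"): 1,
         ("fc1", "fc2"): 6, ("conv1", "fc2"): 1}

bad = {"conv1", "fc1"}          # naive split cuts every edge: cost 16
good = kl_pass(nodes, edges, bad)
print(cut_size(edges, bad), cut_size(edges, good))  # → 16 2
```

On this toy graph the pass drops the cut from 16 to 2, keeping the heavily communicating pairs (conv1/conv2 and fc1/fc2) on the same device, which is the intuition behind partitioning for reduced inter-device communication.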