Application Showcases for TVM with NeuroPilot on Mobile Devices
Sheng-Yuan Cheng, Chun-Ping Chung, Robert Lai, Jenq-Kuen Lee
Workshop Proceedings of the 51st International Conference on Parallel Processing, 2022-08-29
DOI: 10.1145/3547276.3548514
Abstract
With the increasing demand for machine learning inference on mobile devices, more platforms are emerging to provide on-device AI inference. One popular option is TVM, an end-to-end AI compiler; its major drawback is that it does not support every manufacturer-supplied accelerator. On the other hand, MediaTek's AI solution, NeuroPilot, offers high-performance inference on mobile devices, yet it does not support all of the common machine learning frameworks. We therefore combine the strengths of both: the resulting solution accepts models from a variety of machine learning frameworks, including TensorFlow, PyTorch, ONNX, and MXNet, while utilizing MediaTek's AI accelerator. We adopt the TVM BYOC (Bring Your Own Codegen) flow to implement the solution. To illustrate the ability to accept different machine learning frameworks for different tasks, we built an application showcase from three models: a face anti-spoofing model from PyTorch, an emotion detection model from Keras, and an object detection model from TFLite. Since these models have dependencies among them at inference time, we propose a prototype pipeline algorithm to improve the inference performance of the application showcase.
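For readers unfamiliar with the BYOC flow mentioned above, the following is a minimal sketch of how a Relay module is partitioned for an external codegen in TVM. The codegen name "neuropilot" is an assumption for illustration; the actual backend name registered in the paper's implementation may differ.

```python
# Minimal sketch of the standard TVM BYOC partitioning flow.
# The compiler name "neuropilot" is a hypothetical placeholder.
import tvm
from tvm import relay


def partition_for_byoc(mod, compiler_name="neuropilot"):
    """Annotate and partition a Relay module so that operators supported
    by the external codegen `compiler_name` are offloaded to it, while
    the rest falls back to TVM's own compilation path."""
    seq = tvm.transform.Sequential([
        relay.transform.AnnotateTarget(compiler_name),  # tag supported ops
        relay.transform.MergeCompilerRegions(),         # merge adjacent regions
        relay.transform.PartitionGraph(),               # split out external subgraphs
    ])
    with tvm.transform.PassContext(opt_level=3):
        return seq(mod)
```

After partitioning, the external subgraphs are handed to the vendor codegen at build time, which is how a flow like this can target MediaTek's accelerator while still accepting models imported from any frontend TVM supports.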
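The abstract does not spell out the pipeline algorithm, so the sketch below only illustrates the general idea under stated assumptions: a linear chain of dependent model stages connected by bounded queues, so that stage k can process frame i+1 while stage k+1 is still working on frame i. All names and the stage structure are illustrative, not the paper's implementation.

```python
# Hypothetical sketch of pipelined inference over dependent models:
# each model runs in its own worker thread, and frames flow through
# bounded queues so successive frames overlap across stages.
import queue
import threading


def stage(run_model, in_q, out_q):
    """Consume inputs, run one model, and forward results downstream."""
    while True:
        item = in_q.get()
        if item is None:        # sentinel: propagate shutdown downstream
            out_q.put(None)
            break
        out_q.put(run_model(item))


def pipeline(frames, models):
    """Chain the models into a linear pipeline over the input frames."""
    queues = [queue.Queue(maxsize=2) for _ in range(len(models) + 1)]
    workers = [
        threading.Thread(target=stage, args=(m, queues[i], queues[i + 1]))
        for i, m in enumerate(models)
    ]
    for w in workers:
        w.start()
    for f in frames:            # feed inputs, then the shutdown sentinel
        queues[0].put(f)
    queues[0].put(None)
    results = []
    while (r := queues[-1].get()) is not None:
        results.append(r)
    for w in workers:
        w.join()
    return results
```

With three stages (e.g., face anti-spoofing, then emotion detection, then object detection), the steady-state throughput of such a pipeline is bounded by the slowest stage rather than by the sum of all stage latencies, which is the usual motivation for pipelining dependent models.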