Application Showcases for TVM with NeuroPilot on Mobile Devices
Sheng-Yuan Cheng, Chun-Ping Chung, Robert Lai, Jenq-Kuen Lee
Workshop Proceedings of the 51st International Conference on Parallel Processing, 2022-08-29
DOI: 10.1145/3547276.3548514
Abstract
With the increasing demand for machine learning inference on mobile devices, more platforms are emerging to provide on-device AI inference. One popular option is TVM, an end-to-end AI compiler; its major drawback is that it does not support every manufacturer-supplied accelerator. On the other hand, MediaTek's AI solution, NeuroPilot, offers high-performance inference on mobile devices, yet it does not support all of the common machine learning frameworks. We therefore combine the strengths of both: the resulting solution accepts models from a variety of machine learning frameworks, including TensorFlow, PyTorch, ONNX, and MXNet, while utilizing MediaTek's AI accelerator. We adopt the TVM BYOC (Bring Your Own Codegen) flow to implement the solution. To illustrate the ability to accept different machine learning frameworks for different tasks, we built an application showcase from three models: a face anti-spoofing model from PyTorch, an emotion detection model from Keras, and an object detection model from TFLite. Since these models have dependencies among them at inference time, we propose a prototype pipeline algorithm to improve the inference performance of the application showcase.
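For readers unfamiliar with the BYOC flow mentioned above, the following is a minimal sketch of how a Relay module is partitioned for an external codegen in TVM. The codegen name "neuropilot" is an assumption for illustration; the actual backend name registered in the paper's implementation may differ.

```python
# Minimal sketch of the standard TVM BYOC partitioning flow.
# The compiler name "neuropilot" is a hypothetical placeholder.
import tvm
from tvm import relay


def partition_for_byoc(mod, compiler_name="neuropilot"):
    """Annotate and partition a Relay module so that operators supported
    by the external codegen `compiler_name` are offloaded to it, while
    the rest falls back to TVM's own compilation path."""
    seq = tvm.transform.Sequential([
        relay.transform.AnnotateTarget(compiler_name),  # tag supported ops
        relay.transform.MergeCompilerRegions(),         # merge adjacent regions
        relay.transform.PartitionGraph(),               # split out external subgraphs
    ])
    with tvm.transform.PassContext(opt_level=3):
        return seq(mod)
```

After partitioning, the external subgraphs are handed to the vendor codegen at build time, which is how a flow like this can target MediaTek's accelerator while still accepting models imported from any frontend TVM supports.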
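The abstract does not spell out the pipeline algorithm, so the sketch below only illustrates the general idea under stated assumptions: a linear chain of dependent model stages connected by bounded queues, so that stage k can process frame i+1 while stage k+1 is still working on frame i. All names and the stage structure are illustrative, not the paper's implementation.

```python
# Hypothetical sketch of pipelined inference over dependent models:
# each model runs in its own worker thread, and frames flow through
# bounded queues so successive frames overlap across stages.
import queue
import threading


def stage(run_model, in_q, out_q):
    """Consume inputs, run one model, and forward results downstream."""
    while True:
        item = in_q.get()
        if item is None:        # sentinel: propagate shutdown downstream
            out_q.put(None)
            break
        out_q.put(run_model(item))


def pipeline(frames, models):
    """Chain the models into a linear pipeline over the input frames."""
    queues = [queue.Queue(maxsize=2) for _ in range(len(models) + 1)]
    workers = [
        threading.Thread(target=stage, args=(m, queues[i], queues[i + 1]))
        for i, m in enumerate(models)
    ]
    for w in workers:
        w.start()
    for f in frames:            # feed inputs, then the shutdown sentinel
        queues[0].put(f)
    queues[0].put(None)
    results = []
    while (r := queues[-1].get()) is not None:
        results.append(r)
    for w in workers:
        w.join()
    return results
```

With three stages (e.g., face anti-spoofing, then emotion detection, then object detection), the steady-state throughput of such a pipeline is bounded by the slowest stage rather than by the sum of all stage latencies, which is the usual motivation for pipelining dependent models.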