{"title":"ParallelFusion","authors":"Jingyu Lee, Yunxin Liu, Youngki Lee","doi":"10.1145/3469116.3470014","DOIUrl":null,"url":null,"abstract":"Mobile GPUs are extremely under-utilized for DNN computations across different mobile deep learning frameworks and multiple DNNs with various complexities. We explore the feasibility of batching and it improves the throughput by up to 35%. However, real-time applications in mobile have a limited amount of requests to get a benefit from batching. To tackle the challenge, we present ParallelFusion technique that enables concurrent execution of heterogeneous operators to further utilize the mobile GPU. We implemented ParallelFusion over the MNN framework and evaluated on 6 state-of-the-art DNNs. Our evaluation shows that Parallel Fusion achieves up to 195% to 218% throughput with fused execution of 2 and 3 operators compared to single DNN inference.","PeriodicalId":162801,"journal":{"name":"Proceedings of the 5th International Workshop on Embedded and Mobile Deep Learning","volume":"43 7","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 5th International Workshop on Embedded and Mobile Deep Learning","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3469116.3470014","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 4
Abstract
Mobile GPUs are extremely under-utilized for DNN computations, across different mobile deep learning frameworks and across DNNs of varying complexity. We explore the feasibility of batching, which improves throughput by up to 35%. However, real-time mobile applications rarely have enough concurrent requests to benefit from batching. To tackle this challenge, we present ParallelFusion, a technique that executes heterogeneous operators concurrently to further utilize the mobile GPU. We implemented ParallelFusion on the MNN framework and evaluated it on 6 state-of-the-art DNNs. Our evaluation shows that ParallelFusion achieves up to 195% and 218% of single-DNN inference throughput with fused execution of 2 and 3 operators, respectively.
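The paper's implementation lives inside MNN's GPU backend and is not reproduced here. As a purely illustrative sketch of the scheduling idea, the snippet below contrasts serial execution of operators from independent inferences with concurrent dispatch of those operators; all names (conv_op, pool_op) are hypothetical stand-ins, not MNN APIs, and the thread pool merely models overlapping kernels that each under-utilize the accelerator.

```python
import concurrent.futures
import time

# Hypothetical stand-ins for two heterogeneous DNN operators; the sleeps
# model GPU kernels that each leave the accelerator partly idle.
def conv_op(x):
    time.sleep(0.05)  # pretend convolution kernel
    return x + 1

def pool_op(x):
    time.sleep(0.05)  # pretend pooling kernel
    return x * 2

requests = list(range(8))  # independent inference requests

# Baseline: operators from different inferences run strictly one after another.
start = time.time()
serial = [pool_op(conv_op(x)) for x in requests]
serial_time = time.time() - start

# Fused-style scheduling: operators from independent inferences are
# dispatched concurrently, in the spirit of ParallelFusion.
start = time.time()
with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
    fused = list(pool.map(lambda x: pool_op(conv_op(x)), requests))
fused_time = time.time() - start

print(f"serial: {serial_time:.3f}s  concurrent: {fused_time:.3f}s")
```

Running the sketch shows the concurrent schedule finishing in roughly half the serial time, which mirrors (at toy scale) why co-executing heterogeneous operators can raise throughput when batching is not an option.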