Qiuping Zhang , Sheng Sun , Junjie Luo , Min Liu , Zhongcheng Li , Huan Yang , Yuwei Wang
{"title":"Collaborative non-chain DNN inference with multi-device based on layer parallel","authors":"Qiuping Zhang , Sheng Sun , Junjie Luo , Min Liu , Zhongcheng Li , Huan Yang , Yuwei Wang","doi":"10.1016/j.dcan.2023.11.004","DOIUrl":null,"url":null,"abstract":"<div><div>Various intelligent applications based on non-chain DNN models are widely used in Internet of Things (IoT) scenarios. However, resource-constrained IoT devices usually cannot afford the heavy computation burden and cannot guarantee the strict inference latency requirements of non-chain DNN models. Multi-device collaboration has become a promising paradigm for achieving inference acceleration. However, existing works neglect the possibility of inter-layer parallel execution, which fails to exploit the parallelism of collaborating devices and inevitably prolongs the overall completion latency. Thus, there is an urgent need to pay attention to the issue of non-chain DNN inference acceleration with multi-device collaboration based on inter-layer parallel. Three major challenges to be overcome in this problem include exponential computational complexity, complicated layer dependencies, and intractable execution location selection. To this end, we propose a Topological Sorting Based Bidirectional Search (TSBS) algorithm that can adaptively partition non-chain DNN models and select suitable execution locations at layer granularity. More specifically, the TSBS algorithm consists of a topological sorting subalgorithm to realize parallel execution with low computational complexity under complicated layer parallel constraints, and a bidirectional search subalgorithm to quickly find the suitable execution locations for non-parallel layers. Extensive experiments show that the TSBS algorithm significantly outperforms the state-of-the-arts in the completion latency of non-chain DNN inference, a reduction of up to 22.69%.</div></div>","PeriodicalId":48631,"journal":{"name":"Digital Communications and Networks","volume":"10 6","pages":"Pages 1748-1759"},"PeriodicalIF":7.5000,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Digital Communications and Networks","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2352864823001645","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"TELECOMMUNICATIONS","Score":null,"Total":0}
引用次数: 0
Abstract
Various intelligent applications based on non-chain DNN models are widely used in Internet of Things (IoT) scenarios. However, resource-constrained IoT devices usually cannot afford the heavy computation burden and cannot guarantee the strict inference latency requirements of non-chain DNN models. Multi-device collaboration has become a promising paradigm for achieving inference acceleration. However, existing works neglect the possibility of inter-layer parallel execution, which fails to exploit the parallelism of collaborating devices and inevitably prolongs the overall completion latency. Thus, there is an urgent need to pay attention to the issue of non-chain DNN inference acceleration with multi-device collaboration based on inter-layer parallel. Three major challenges to be overcome in this problem include exponential computational complexity, complicated layer dependencies, and intractable execution location selection. To this end, we propose a Topological Sorting Based Bidirectional Search (TSBS) algorithm that can adaptively partition non-chain DNN models and select suitable execution locations at layer granularity. More specifically, the TSBS algorithm consists of a topological sorting subalgorithm to realize parallel execution with low computational complexity under complicated layer parallel constraints, and a bidirectional search subalgorithm to quickly find the suitable execution locations for non-parallel layers. Extensive experiments show that the TSBS algorithm significantly outperforms the state-of-the-arts in the completion latency of non-chain DNN inference, a reduction of up to 22.69%.
期刊介绍:
Digital Communications and Networks is a prestigious journal that emphasizes on communication systems and networks. We publish only top-notch original articles and authoritative reviews, which undergo rigorous peer-review. We are proud to announce that all our articles are fully Open Access and can be accessed on ScienceDirect. Our journal is recognized and indexed by eminent databases such as the Science Citation Index Expanded (SCIE) and Scopus.
In addition to regular articles, we may also consider exceptional conference papers that have been significantly expanded. Furthermore, we periodically release special issues that focus on specific aspects of the field.
In conclusion, Digital Communications and Networks is a leading journal that guarantees exceptional quality and accessibility for researchers and scholars in the field of communication systems and networks.