Ioannis Panopoulos, Stylianos I. Venieris, I. Venieris
DOI: 10.1145/3665868 (https://doi.org/10.1145/3665868)
Published: 2024-05-23 (Journal Article)
CARIn: Constraint-Aware and Responsive Inference on Heterogeneous Devices for Single- and Multi-DNN Workloads
The relentless expansion of deep learning (DL) applications in recent years has prompted a pivotal shift towards on-device execution, driven by the urgent need for real-time processing, heightened privacy concerns, and reduced latency across diverse domains. This paper addresses the challenges inherent in optimising the execution of deep neural networks (DNNs) on mobile devices, with a focus on device heterogeneity, multi-DNN execution, and dynamic runtime adaptation. We introduce CARIn, a novel framework designed for the optimised deployment of both single- and multi-DNN applications under user-defined service-level objectives (SLOs). Leveraging an expressive multi-objective optimisation (MOO) framework and a runtime-aware sorting and search algorithm (RASS) as the MOO solver, CARIn facilitates efficient adaptation to dynamic conditions while addressing resource contention issues associated with multi-DNN execution. Notably, RASS generates a set of configurations, anticipating subsequent runtime adaptation, ensuring rapid, low-overhead adjustments in response to environmental fluctuations. Extensive evaluation across diverse tasks, including text classification, scene recognition, and face analysis, showcases the versatility of CARIn across various model architectures, such as Convolutional Neural Networks (CNNs) and Transformers, and realistic use cases. We observe a substantial enhancement in the fair treatment of the problem’s objectives, reaching 1.92× when compared to single-model designs, and up to 10.69× in contrast to the state-of-the-art OODIn framework. Additionally, we achieve a significant gain of up to 4.06× over hardware-unaware designs in multi-DNN applications. Finally, our framework sustains its performance while effectively eliminating the time overhead associated with identifying the optimal design in response to environmental challenges.
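The abstract describes a solver that prepares a set of configurations ahead of time so that runtime adaptation needs no fresh search. The sketch below is only a rough illustration of that general idea, not the RASS algorithm itself: the `Config` fields, the accuracy-per-millisecond score, the constraint thresholds, and the candidate values are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """A hypothetical candidate design: a model variant mapped to a processor."""
    model: str
    processor: str
    accuracy: float    # higher is better
    latency_ms: float  # lower is better


def rank_feasible(candidates, max_latency_ms, min_accuracy):
    """Offline step: keep configs that meet the constraints, best first.

    The score (accuracy per millisecond) is an illustrative trade-off,
    not the objective function used in the paper.
    """
    feasible = [
        c for c in candidates
        if c.latency_ms <= max_latency_ms and c.accuracy >= min_accuracy
    ]
    return sorted(feasible, key=lambda c: c.accuracy / c.latency_ms, reverse=True)


def select(ranked, unavailable):
    """Runtime step: return the best precomputed config whose processor is usable."""
    for c in ranked:
        if c.processor not in unavailable:
            return c
    return None  # no precomputed config fits the current conditions


# Made-up candidates, purely for illustration.
candidates = [
    Config("small", "cpu", accuracy=0.70, latency_ms=20.0),
    Config("small", "gpu", accuracy=0.70, latency_ms=10.0),
    Config("large", "gpu", accuracy=0.80, latency_ms=40.0),
    Config("large", "cpu", accuracy=0.80, latency_ms=120.0),
]

ranked = rank_feasible(candidates, max_latency_ms=50.0, min_accuracy=0.65)
best = select(ranked, unavailable=set())        # normal conditions
fallback = select(ranked, unavailable={"gpu"})  # e.g. the GPU becomes contended
```

Because the ranking is computed once, reacting to a change such as a processor becoming unavailable reduces to a linear scan over a short list, which is the kind of low-overhead switch the abstract alludes to.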
About the journal:
ACS Applied Bio Materials is an interdisciplinary journal publishing original research covering all aspects of biomaterials and biointerfaces including and beyond the traditional biosensing, biomedical and therapeutic applications.
The journal is devoted to reports of new and original experimental and theoretical research of an applied nature that integrates knowledge in the areas of materials, engineering, physics, bioscience, and chemistry into important bio applications. The journal is specifically interested in work that addresses the relationship between structure and function and assesses the stability and degradation of materials under relevant environmental and biological conditions.