面向异构MPSoC部署的映射感知图神经结构搜索框架

IF 2.8 3区 计算机科学 Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE
Mohanad Odema, Halima Bouzidi, Hamza Ouarnoughi, Smail Niar, Mohammad Abdullah Al Faruque
{"title":"面向异构MPSoC部署的映射感知图神经结构搜索框架","authors":"Mohanad Odema, Halima Bouzidi, Hamza Ouarnoughi, Smail Niar, Mohammad Abdullah Al Faruque","doi":"10.1145/3609386","DOIUrl":null,"url":null,"abstract":"Graph Neural Networks (GNNs) are becoming increasingly popular for vision-based applications due to their intrinsic capacity in modeling structural and contextual relations between various parts of an image frame. On another front, the rising popularity of deep vision-based applications at the edge has been facilitated by the recent advancements in heterogeneous multi-processor Systems on Chips (MPSoCs) that enable inference under real-time, stringent execution requirements. By extension, GNNs employed for vision-based applications must adhere to the same execution requirements. Yet contrary to typical deep neural networks, the irregular flow of graph learning operations poses a challenge to running GNNs on such heterogeneous MPSoC platforms. In this paper, we propose a novel unified design-mapping approach for efficient processing of vision GNN workloads on heterogeneous MPSoC platforms. Particularly, we develop MaGNAS, a mapping-aware Graph Neural Architecture Search framework. MaGNAS proposes a GNN architectural design space coupled with prospective mapping options on a heterogeneous SoC to identify model architectures that maximize on-device resource efficiency. To achieve this, MaGNAS employs a two-tier evolutionary search to identify optimal GNNs and mapping pairings that yield the best performance trade-offs. Through designing a supernet derived from the recent Vision GNN (ViG) architecture, we conducted experiments on four (04) state-of-the-art vision datasets using both ( i ) a real hardware SoC platform (NVIDIA Xavier AGX) and ( ii ) a performance/cost model simulator for DNN accelerators. Our experimental results demonstrate that MaGNAS is able to provide 1.57 × latency speedup and is 3.38 × more energy-efficient for several vision datasets executed on the Xavier MPSoC vs. the GPU-only deployment while sustaining an average 0.11% accuracy reduction from the baseline.","PeriodicalId":50914,"journal":{"name":"ACM Transactions on Embedded Computing Systems","volume":"2016 1","pages":"0"},"PeriodicalIF":2.8000,"publicationDate":"2023-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"MaGNAS: A Mapping-Aware Graph Neural Architecture Search Framework for Heterogeneous MPSoC Deployment\",\"authors\":\"Mohanad Odema, Halima Bouzidi, Hamza Ouarnoughi, Smail Niar, Mohammad Abdullah Al Faruque\",\"doi\":\"10.1145/3609386\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Graph Neural Networks (GNNs) are becoming increasingly popular for vision-based applications due to their intrinsic capacity in modeling structural and contextual relations between various parts of an image frame. On another front, the rising popularity of deep vision-based applications at the edge has been facilitated by the recent advancements in heterogeneous multi-processor Systems on Chips (MPSoCs) that enable inference under real-time, stringent execution requirements. By extension, GNNs employed for vision-based applications must adhere to the same execution requirements. Yet contrary to typical deep neural networks, the irregular flow of graph learning operations poses a challenge to running GNNs on such heterogeneous MPSoC platforms. In this paper, we propose a novel unified design-mapping approach for efficient processing of vision GNN workloads on heterogeneous MPSoC platforms. Particularly, we develop MaGNAS, a mapping-aware Graph Neural Architecture Search framework. MaGNAS proposes a GNN architectural design space coupled with prospective mapping options on a heterogeneous SoC to identify model architectures that maximize on-device resource efficiency. To achieve this, MaGNAS employs a two-tier evolutionary search to identify optimal GNNs and mapping pairings that yield the best performance trade-offs. Through designing a supernet derived from the recent Vision GNN (ViG) architecture, we conducted experiments on four (04) state-of-the-art vision datasets using both ( i ) a real hardware SoC platform (NVIDIA Xavier AGX) and ( ii ) a performance/cost model simulator for DNN accelerators. Our experimental results demonstrate that MaGNAS is able to provide 1.57 × latency speedup and is 3.38 × more energy-efficient for several vision datasets executed on the Xavier MPSoC vs. the GPU-only deployment while sustaining an average 0.11% accuracy reduction from the baseline.\",\"PeriodicalId\":50914,\"journal\":{\"name\":\"ACM Transactions on Embedded Computing Systems\",\"volume\":\"2016 1\",\"pages\":\"0\"},\"PeriodicalIF\":2.8000,\"publicationDate\":\"2023-09-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM Transactions on Embedded Computing Systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3609386\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Embedded Computing Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3609386","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
引用次数: 1

摘要

图神经网络(gnn)在基于视觉的应用中越来越受欢迎,因为它们具有对图像框架各部分之间的结构和上下文关系进行建模的内在能力。另一方面,基于深度视觉的边缘应用的日益普及,得益于异构多处理器片上系统(mpsoc)的最新进展,这些应用能够在实时、严格的执行要求下进行推理。通过扩展,用于基于视觉的应用程序的gnn必须遵守相同的执行要求。然而,与典型的深度神经网络相反,图学习操作的不规则流给在这种异构MPSoC平台上运行gnn带来了挑战。在本文中,我们提出了一种新的统一设计映射方法,用于在异构MPSoC平台上高效处理视觉GNN工作负载。特别地,我们开发了MaGNAS,一个映射感知的图神经结构搜索框架。MaGNAS提出了一个GNN架构设计空间,结合异构SoC上的前瞻性映射选项,以确定最大限度提高设备上资源效率的模型架构。为了实现这一目标,MaGNAS采用两层进化搜索来识别最佳gnn和映射配对,从而产生最佳的性能权衡。通过设计一个源自最新视觉GNN (ViG)架构的超级网络,我们使用(i)真正的硬件SoC平台(NVIDIA Xavier AGX)和(ii) DNN加速器的性能/成本模型模拟器在四(04)个最先进的视觉数据集上进行了实验。我们的实验结果表明,对于在Xavier MPSoC上执行的多个视觉数据集,与仅使用gpu的部署相比,MaGNAS能够提供1.57倍的延迟加速和3.38倍的能效,同时保持比基线平均0.11%的精度降低。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
MaGNAS: A Mapping-Aware Graph Neural Architecture Search Framework for Heterogeneous MPSoC Deployment
Graph Neural Networks (GNNs) are becoming increasingly popular for vision-based applications due to their intrinsic capacity in modeling structural and contextual relations between various parts of an image frame. On another front, the rising popularity of deep vision-based applications at the edge has been facilitated by the recent advancements in heterogeneous multi-processor Systems on Chips (MPSoCs) that enable inference under real-time, stringent execution requirements. By extension, GNNs employed for vision-based applications must adhere to the same execution requirements. Yet contrary to typical deep neural networks, the irregular flow of graph learning operations poses a challenge to running GNNs on such heterogeneous MPSoC platforms. In this paper, we propose a novel unified design-mapping approach for efficient processing of vision GNN workloads on heterogeneous MPSoC platforms. Particularly, we develop MaGNAS, a mapping-aware Graph Neural Architecture Search framework. MaGNAS proposes a GNN architectural design space coupled with prospective mapping options on a heterogeneous SoC to identify model architectures that maximize on-device resource efficiency. To achieve this, MaGNAS employs a two-tier evolutionary search to identify optimal GNNs and mapping pairings that yield the best performance trade-offs. Through designing a supernet derived from the recent Vision GNN (ViG) architecture, we conducted experiments on four (04) state-of-the-art vision datasets using both ( i ) a real hardware SoC platform (NVIDIA Xavier AGX) and ( ii ) a performance/cost model simulator for DNN accelerators. Our experimental results demonstrate that MaGNAS is able to provide 1.57 × latency speedup and is 3.38 × more energy-efficient for several vision datasets executed on the Xavier MPSoC vs. the GPU-only deployment while sustaining an average 0.11% accuracy reduction from the baseline.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
ACM Transactions on Embedded Computing Systems
ACM Transactions on Embedded Computing Systems 工程技术-计算机:软件工程
CiteScore
3.70
自引率
0.00%
发文量
138
审稿时长
6 months
期刊介绍: The design of embedded computing systems, both the software and hardware, increasingly relies on sophisticated algorithms, analytical models, and methodologies. ACM Transactions on Embedded Computing Systems (TECS) aims to present the leading work relating to the analysis, design, behavior, and experience with embedded computing systems.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信