Real-Time Scheduling of Machine Learning Operations on Heterogeneous Neuromorphic SoC

Anup Das
{"title":"Real-Time Scheduling of Machine Learning Operations on Heterogeneous Neuromorphic SoC","authors":"Anup Das","doi":"10.1109/MEMOCODE57689.2022.9954596","DOIUrl":null,"url":null,"abstract":"Neuromorphic Systems-on-Chip (NSoCs) are becoming heterogeneous by integrating general-purpose processors (GPPs) and neural processing units (NPUs) on the same SoC. For embedded systems, an NSoC may need to execute user applications built using a variety of machine learning models. We propose a real-time scheduler, called PRISM, which can schedule machine learning models on a heterogeneous NSoC either individually or concurrently to improve their system performance. PRISM consists of the following four key steps. First, it constructs an interprocessor communication (IPC) graph of a machine learning model from a mapping and a self-timed schedule. Second, it creates a transaction order for the communication actors and embeds this order into the IPC graph. Third, it schedules the graph on an NSoC by overlapping communication with the computation. Finally, it uses a Hill Climbing heuristic to explore the design space of mapping operations on GPPs and NPUs to improve the performance. Unlike existing schedulers which use only the NPUs of an NSoC, PRISM improves performance by enabling batch, pipeline, and operation parallelism via exploiting a platform's heterogeneity. For use-cases with concurrent applications, PRISM uses a heuristic resource sharing strategy and a non-preemptive scheduling to reduce the expected wait time before concurrent operations can be scheduled on contending resources. Our extensive evaluations with 20 machine learning workloads show that PRISM significantly improves the performance per watt for both individual applications and use-cases when compared to state-of-the-art schedulers.","PeriodicalId":157326,"journal":{"name":"2022 20th ACM-IEEE International Conference on Formal Methods and Models for System Design (MEMOCODE)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-09-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 20th ACM-IEEE International Conference on Formal Methods and Models for System Design (MEMOCODE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/MEMOCODE57689.2022.9954596","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

Neuromorphic Systems-on-Chip (NSoCs) are becoming heterogeneous by integrating general-purpose processors (GPPs) and neural processing units (NPUs) on the same SoC. For embedded systems, an NSoC may need to execute user applications built using a variety of machine learning models. We propose a real-time scheduler, called PRISM, which can schedule machine learning models on a heterogeneous NSoC either individually or concurrently to improve their system performance. PRISM consists of the following four key steps. First, it constructs an interprocessor communication (IPC) graph of a machine learning model from a mapping and a self-timed schedule. Second, it creates a transaction order for the communication actors and embeds this order into the IPC graph. Third, it schedules the graph on an NSoC by overlapping communication with the computation. Finally, it uses a Hill Climbing heuristic to explore the design space of mapping operations on GPPs and NPUs to improve the performance. Unlike existing schedulers which use only the NPUs of an NSoC, PRISM improves performance by enabling batch, pipeline, and operation parallelism via exploiting a platform's heterogeneity. For use-cases with concurrent applications, PRISM uses a heuristic resource sharing strategy and a non-preemptive scheduling to reduce the expected wait time before concurrent operations can be scheduled on contending resources. Our extensive evaluations with 20 machine learning workloads show that PRISM significantly improves the performance per watt for both individual applications and use-cases when compared to state-of-the-art schedulers.
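The abstract's final step, a Hill Climbing search over the operation-to-processor mapping, can be illustrated with a minimal sketch. The operation names, latency table, and communication penalty below are hypothetical placeholders, not PRISM's actual IPC-graph-derived cost model; the sketch only shows the accept-if-better local search over GPP/NPU assignments.

```python
import random

# Hypothetical per-operation latencies (ms) on each processor type.
# PRISM's real cost comes from the IPC graph and self-timed schedule.
OP_LATENCY = {
    "conv1": {"GPP": 4.0, "NPU": 1.2},
    "conv2": {"GPP": 6.5, "NPU": 1.8},
    "pool":  {"GPP": 0.9, "NPU": 1.1},
    "fc":    {"GPP": 3.2, "NPU": 0.8},
}
PROCESSORS = ["GPP", "NPU"]


def makespan(mapping):
    """Toy cost: the most loaded processor plus a fixed penalty for each
    GPP<->NPU boundary, standing in for inter-processor communication."""
    load = {p: 0.0 for p in PROCESSORS}
    for op, proc in mapping.items():
        load[proc] += OP_LATENCY[op][proc]
    ops = list(OP_LATENCY)
    comm = sum(0.5 for a, b in zip(ops, ops[1:]) if mapping[a] != mapping[b])
    return max(load.values()) + comm


def hill_climb(iterations=200, seed=0):
    """Flip one operation's processor at a time, keeping only improving moves."""
    rng = random.Random(seed)
    mapping = {op: rng.choice(PROCESSORS) for op in OP_LATENCY}
    best = makespan(mapping)
    for _ in range(iterations):
        op = rng.choice(list(OP_LATENCY))
        candidate = dict(mapping)
        candidate[op] = "NPU" if mapping[op] == "GPP" else "GPP"
        cost = makespan(candidate)
        if cost < best:
            mapping, best = candidate, cost
    return mapping, best


if __name__ == "__main__":
    mapping, cost = hill_climb()
    print(f"best mapping: {mapping}, estimated makespan: {cost:.1f} ms")
```

A real implementation would evaluate each candidate mapping by re-running the self-timed schedule on the IPC graph (steps one through three above) rather than this toy load-plus-penalty estimate.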