Centralized Generic Interfaces in Hardware/Software Co-design for AI Accelerators

Dongju Chae, Parichay Kapoor
{"title":"Centralized Generic Interfaces in Hardware/Software Co-design for AI Accelerators","authors":"Dongju Chae, Parichay Kapoor","doi":"10.1145/3387940.3392225","DOIUrl":null,"url":null,"abstract":"A hardware/software co-design for AI accelerators such as Neural Processing Unit (NPU) is essential not only to support the required functionality but also to meet primary goals of improved performance and power efficiency. However, their ever-changing requirements often introduce undesirable development costs. Indeed, it is quite challenging for developers from different backgrounds to efficiently work together to construct a full HW/SW stack to develop AI accelerators. This paper addresses these challenges, and proposes a centralized collaboration methodology for efficient full-stack development, especially targeting NPU HW. The proposal is inspired based on the observations from our experiences, presented later as a case study. As not all of the involved developers have enough knowledge of software engineering, this approach suggests making a central development group (e.g., runtime system software) have a higher priority to organize and devise common interfaces including APIs for each layer in the full-stack. This aims to minimize unnecessary discussions between development groups and hide any minor updates introduced with each new design, reducing the overall development costs and improving the quality of products. 
More importantly, each development group can focus on their work as much as possible with this approach.","PeriodicalId":309659,"journal":{"name":"Proceedings of the IEEE/ACM 42nd International Conference on Software Engineering Workshops","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2020-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the IEEE/ACM 42nd International Conference on Software Engineering Workshops","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3387940.3392225","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 2

Abstract

Hardware/software co-design for AI accelerators such as Neural Processing Units (NPUs) is essential not only to support the required functionality but also to meet the primary goals of improved performance and power efficiency. However, the ever-changing requirements of these accelerators often introduce undesirable development costs. Indeed, it is quite challenging for developers from different backgrounds to work together efficiently to construct a full HW/SW stack for AI accelerators. This paper addresses these challenges and proposes a centralized collaboration methodology for efficient full-stack development, especially targeting NPU HW. The proposal is inspired by observations from our own experience, presented later as a case study. Because not all of the developers involved have sufficient knowledge of software engineering, the approach gives a central development group (e.g., the runtime system software group) higher priority to organize and devise common interfaces, including APIs, for each layer of the full stack. This aims to minimize unnecessary discussions between development groups and to hide the minor updates introduced with each new design, reducing overall development costs and improving product quality. More importantly, the approach lets each development group focus on its own work as much as possible.
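The idea of a centrally owned interface that hides minor per-design updates can be illustrated with a small sketch. Everything named here (`NpuDevice`, `NpuRevA`, `NpuRevB`, `infer`, the 4-element alignment rule, and the doubling "inference") is hypothetical and chosen only for illustration; it is not the interface described in the paper.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class NpuDevice(ABC):
    """Hypothetical generic interface owned by the central (runtime) group."""

    @abstractmethod
    def load_model(self, model: bytes) -> int:
        """Register a compiled model and return a handle to it."""

    @abstractmethod
    def run(self, handle: int, inputs: List[float]) -> List[float]:
        """Run a loaded model on the given inputs."""


class NpuRevA(NpuDevice):
    """Hypothetical first hardware revision."""

    def __init__(self) -> None:
        self._models: Dict[int, bytes] = {}

    def load_model(self, model: bytes) -> int:
        handle = len(self._models)
        self._models[handle] = model
        return handle

    def run(self, handle: int, inputs: List[float]) -> List[float]:
        if handle not in self._models:
            raise KeyError(f"unknown model handle: {handle}")
        # Stand-in for real inference: double every input value.
        return [2.0 * x for x in inputs]


class NpuRevB(NpuRevA):
    """Hypothetical later revision that needs inputs padded to a multiple of 4.

    The padding is a "minor update" that stays hidden behind the interface.
    """

    ALIGN = 4  # assumed alignment requirement, for illustration only

    def run(self, handle: int, inputs: List[float]) -> List[float]:
        pad = (-len(inputs)) % self.ALIGN
        padded = inputs + [0.0] * pad
        # Drop the padded tail so callers see the same shape as on revision A.
        return super().run(handle, padded)[: len(inputs)]


def infer(device: NpuDevice, model: bytes, inputs: List[float]) -> List[float]:
    """Upper-layer code: written once against the generic interface."""
    handle = device.load_model(model)
    return device.run(handle, inputs)
```

In this sketch, `infer` stands in for any upper layer of the stack. Swapping `NpuRevA()` for `NpuRevB()` requires no change to it, which is the kind of isolation between development groups that the abstract describes.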