Missing the Forest for the Trees: End-to-End AI Application Performance in Edge Data Centers

Daniel Richins, Dharmisha Doshi, Matthew Blackmore, A. Nair, Neha Pathapati, Ankit Patel, Brainard Daguman, Daniel Dobrijalowski, R. Illikkal, Kevin Long, David Zimmerman, V. Reddi
DOI: 10.1109/HPCA47549.2020.00049
Published in: 2020 IEEE International Symposium on High Performance Computer Architecture (HPCA)
Publication date: 2020-02-01
Citations: 19

Abstract

Artificial intelligence and machine learning are experiencing widespread adoption in industry, academia, and even the public consciousness. This has been driven by the rapid advances in the applications and accuracy of AI through increasingly complex algorithms and models; this, in turn, has spurred research into developing specialized hardware AI accelerators. The rapid pace of the advances makes it easy to miss the forest for the trees: they are often developed and evaluated in a vacuum, without considering the full application environment in which they must eventually operate. In this paper, we deploy and characterize Face Recognition, an AI-centric edge video analytics application built using open-source and widely adopted infrastructure and ML tools. We evaluate its holistic, end-to-end behavior in a production-size edge data center and reveal the "AI tax" for all the processing that is involved. Even though the application is built around state-of-the-art AI and ML algorithms, it relies heavily on pre- and post-processing code which must be executed on a general-purpose CPU. As AI-centric applications start to reap the acceleration promised by so many accelerators, we find they impose stresses on the underlying software infrastructure and the data center's capabilities: storage and network bandwidth become major bottlenecks with increasing AI acceleration. By not having to serve a wide variety of applications, we show that a purpose-built edge data center can be designed to accommodate the stresses of accelerated AI at 15% lower TCO than one derived from homogeneous servers and infrastructure. We also discuss how our conclusions generalize beyond Face Recognition, as many AI-centric applications at the edge rely upon the same underlying software and hardware infrastructure.
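The "AI tax" dynamic described in the abstract can be illustrated with a toy model: as the inference stage is accelerated, the fixed CPU-side pre- and post-processing stages come to dominate end-to-end latency. The following sketch uses made-up stage latencies, not measurements from the paper; the stage names and numbers are purely illustrative assumptions.

```python
def end_to_end_latency(stages):
    """Sum per-frame stage latencies (ms) across the pipeline."""
    return sum(stages.values())

def ai_tax_fraction(stages, ai_stage="inference"):
    """Fraction of end-to-end time spent outside the AI kernel itself."""
    total = end_to_end_latency(stages)
    return (total - stages[ai_stage]) / total

# Hypothetical per-frame latencies in milliseconds (illustrative only).
cpu_only = {"decode": 8.0, "preprocess": 5.0, "inference": 40.0, "postprocess": 4.0}
# Same pipeline, but inference offloaded to a 10x-faster accelerator.
accelerated = {"decode": 8.0, "preprocess": 5.0, "inference": 4.0, "postprocess": 4.0}

print(f"CPU-only AI tax:    {ai_tax_fraction(cpu_only):.0%}")     # ~30%
print(f"Accelerated AI tax: {ai_tax_fraction(accelerated):.0%}")  # ~81%
```

Note that accelerating only the inference stage raises the non-AI share of runtime from roughly a third to over four fifths, which is the paper's point: speeding up the model shifts the bottleneck onto the general-purpose CPU, storage, and network.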