Graph neural networks with configuration cross-attention for tensor compilers

Dmitrii Khizbullin, Eduardo Rocha de Andrade, Thanh Hau Nguyen, Matheus Pedroza Ferreira, David R. Pugh
arXiv - CS - Performance, published 2024-05-26. DOI: arxiv-2405.16623 (https://doi.org/arxiv-2405.16623)

Abstract

The recent popularity of neural networks brings a need to serve inference workloads efficiently. A neural network inference workload can be represented as a computational graph whose nodes are operators transforming multidimensional tensors. The tensors can be transposed and/or tiled in a combinatorially large number of ways, and some of these configurations lead to accelerated inference. We propose TGraph, a neural graph architecture that allows screening for fast configurations of the target computational graph, thus representing an artificial intelligence (AI) tensor compiler in contrast to traditional heuristics-based compilers. The proposed solution improves mean Kendall's $\tau$ across the layout collections of TpuGraphs from 29.8% for the reliable baseline to 67.4% for TGraph. We estimate the potential CO$_2$ emission reduction associated with our work to be equivalent to over 50% of the total household emissions in the areas hosting AI-oriented data centers.
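For readers unfamiliar with the reported metric, the following is a minimal sketch of how Kendall's $\tau$ measures agreement between a model's ranking of configurations and their measured runtimes. It is not taken from the paper: the function name `kendall_tau`, the tie-free tau-a variant, and all numbers are assumptions chosen for illustration. The percentages in the abstract presumably correspond to $\tau$ scaled by 100.

```python
from itertools import combinations


def kendall_tau(predicted, measured):
    """Kendall's tau-a between two equally long score lists.

    Hypothetical helper for illustration only. A pair of configurations
    is concordant when both lists order it the same way, discordant when
    they disagree. Tied pairs count as neither.
    """
    if len(predicted) != len(measured):
        raise ValueError("score lists must have the same length")
    n = len(predicted)
    if n < 2:
        raise ValueError("need at least two configurations")

    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        sign = (predicted[i] - predicted[j]) * (measured[i] - measured[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1

    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# Made-up data: four layout configurations of one computational graph.
measured_runtimes = [1.0, 2.0, 3.0, 4.0]   # assumed ground-truth runtimes
predicted_scores = [0.1, 0.3, 0.2, 0.9]    # assumed model outputs

tau = kendall_tau(predicted_scores, measured_runtimes)
print(f"Kendall's tau: {tau:.3f}")
```

In this toy case the model swaps only the second and third configurations, so five of the six pairs are ordered correctly. A value of 1 means the predicted ranking matches the measured one exactly, and a value near 0 means the ranking carries little information about which configurations are fast.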