Transfer Learning Across Heterogeneous Features For Efficient Tensor Program Generation

Gaurav Verma, Siddhisanket Raskar, Zhenda Xie, A. Malik, M. Emani, Barbara M. Chapman
Proceedings of the 2nd International Workshop on Extreme Heterogeneity Solutions. Publication date: 2023-02-25. DOI: 10.1145/3587278.3595644 (https://doi.org/10.1145/3587278.3595644)
Citations: 0

Abstract

Tuning tensor program generation involves searching the possible combinations of program transformations for a given program on target hardware in order to optimize the tensor program's execution. The process is already complex because of the massive search space, and the exponential number of transformation combinations makes auto-tuning tensor program generation even more challenging, especially for a heterogeneous target. In this research, we attempt to address these problems by learning joint neural network and hardware features and transferring them to new target hardware. We extensively study the existing state-of-the-art dataset, TenSet, perform a comparative analysis of test split strategies, and propose methodologies to prune the dataset. We adopt an attention-inspired approach for tuning tensor programs that enables them to embed neural network and hardware-specific features. Our approach could prune the dataset up to 45% of the baseline without compromising the Pairwise Comparison Accuracy (PCA). Further, the proposed methodology can achieve on-par or improved mean inference time with 25%-40% of the baseline tuning time across different networks and target hardware.
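The abstract reports results in terms of Pairwise Comparison Accuracy but does not define it. The sketch below assumes the usual reading of the metric for a learned cost model: the fraction of program pairs whose predicted ordering agrees with their measured ordering. The function name, the tie-handling rule, and the sample scores are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations
from typing import Sequence


def pairwise_comparison_accuracy(
    predicted: Sequence[float], measured: Sequence[float]
) -> float:
    """Fraction of pairs ranked in the same order by prediction and measurement.

    Assumption: pairs whose measured scores are tied are skipped, and a pair
    with tied predictions counts as incorrect.
    """
    if len(predicted) != len(measured):
        raise ValueError("predicted and measured must have the same length")

    correct = 0
    total = 0
    for i, j in combinations(range(len(measured)), 2):
        measured_diff = measured[i] - measured[j]
        if measured_diff == 0:
            continue  # no ground-truth ordering to compare against
        total += 1
        predicted_diff = predicted[i] - predicted[j]
        # Same sign means the cost model ordered this pair correctly.
        if predicted_diff * measured_diff > 0:
            correct += 1
    return correct / total if total else 0.0


# Hypothetical scores for four candidate programs (higher is better).
predicted_scores = [0.9, 0.2, 0.5, 0.4]
measured_scores = [1.0, 0.1, 0.3, 0.6]
accuracy = pairwise_comparison_accuracy(predicted_scores, measured_scores)
print(f"pairwise comparison accuracy: {accuracy:.3f}")
```

Under this definition a pruned dataset "does not compromise" the metric when a cost model trained on it orders held-out program pairs about as well as one trained on the full dataset.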