Performance Prediction From Source Code Is Task and Domain Specific

Markus Böck, Sarra Habchi, Mathieu Nayrolles, Jürgen Cito
Published in: 2023 IEEE/ACM 31st International Conference on Program Comprehension (ICPC)
Publication date: 2023-05-01
DOI: 10.1109/ICPC58990.2023.00015
URL: https://doi.org/10.1109/ICPC58990.2023.00015
Citations: 1

Abstract

Performance is key to the success and adoption of software systems. In video games, performance is commonly highlighted as one of the top quality concerns raised by players. To check the performance of their systems, development teams tend to rely on profiling and monitoring tools, which observe program executions to identify regressions. The use of static analysis tools for this purpose has so far been limited. Lately, the success of Large Language Models in many code analytics tools has led to attempts to leverage them in static performance analysis. These studies showed promising results in predicting runtime and regressions on large public datasets. In this paper, we evaluate the usability of such models in practice, particularly in the domain of video games. We train a state-of-the-art neural network on the Code4Bench dataset to predict runtime regressions for programming competition programs, then evaluate its ability to generalize to new domains. Our results show that these models achieve strong results (e.g., 95.73% accuracy for performance comparison) on the original domain for programs solving in-sample programming tasks, yet fail to generalize to out-of-sample tasks. Furthermore, we show that transfer techniques such as domain adversarial adaptation and model fine-tuning are not sufficient to transfer these models to the target industrial domain of AAA games.
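The distinction between in-sample and out-of-sample tasks can be made concrete with a small sketch. The code below is an illustration only, not the authors' pipeline: the tuple layout `(task, program, runtime)`, all function names, and the toy setup are assumptions, and nothing here reflects the actual Code4Bench schema. It builds pairs of programs that solve the same task, labels each pair by which program is slower, and contrasts a random split over pairs (test tasks also appear in training) with a task-level split (test tasks are never seen in training).

```python
import random
from itertools import combinations


def make_pairs(submissions):
    """Build (task, prog_a, prog_b, label) tuples for programs solving the same task.

    `submissions` is an iterable of (task, program, runtime) tuples (hypothetical layout).
    label is 1 if prog_a is slower than prog_b, else 0; ties are skipped.
    """
    by_task = {}
    for task, prog, runtime in submissions:
        by_task.setdefault(task, []).append((prog, runtime))
    pairs = []
    for task, progs in by_task.items():
        for (a, ra), (b, rb) in combinations(progs, 2):
            if ra != rb:
                pairs.append((task, a, b, int(ra > rb)))
    return pairs


def split_in_sample(pairs, test_fraction, seed=0):
    """Random split over pairs: tasks in the test set may also occur in training."""
    rng = random.Random(seed)
    shuffled = list(pairs)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def split_out_of_sample(pairs, test_tasks):
    """Task-level split: no pair from a test task is available during training."""
    train = [p for p in pairs if p[0] not in test_tasks]
    test = [p for p in pairs if p[0] in test_tasks]
    return train, test


def pairwise_accuracy(predictions, pairs):
    """Fraction of pairs whose predicted label matches the true 'slower' label."""
    correct = sum(int(pred == pair[3]) for pred, pair in zip(predictions, pairs))
    return correct / len(pairs)
```

Under these assumptions, a model evaluated with `split_in_sample` has already seen other solutions to every test task, whereas `split_out_of_sample` measures whether it can compare programs for tasks it has never encountered, which is the setting where the abstract reports that generalization fails.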