On the Applicability of Code Language Models to Scientific Computing Programs

IF 6.5 | Zone 1: Computer Science | Q1: COMPUTER SCIENCE, SOFTWARE ENGINEERING

IEEE Transactions on Software Engineering, Pub Date: 2025-04-25, DOI: 10.1109/TSE.2025.3564599

Qianhui Zhao; Fang Liu; Xiao Long; Chengru Wu; Li Zhang

Volume 51, Issue 6, pages 1685-1701. Full text: https://ieeexplore.ieee.org/document/10977820/

Citations: 0

Abstract

Scientific Computing Programming Languages (SCPLs), like MATLAB and R, are popular and widely used for computational mathematics. In recent years, pre-trained code language models (CLMs) have automated many code-related tasks, covering various general programming languages. SCPLs share many similarities with general programming languages, including similar syntactic structures and the semantics of identifiers. Despite the similarities, there exist many differences between them. For example, lots of numerical operations and dedicated libraries exist in SCPLs. However, there has been little comprehensive work analyzing CLMs’ capabilities in the understanding and generation of pragmatic scientific computing programs. To this end, we investigate the applicability of code language models for the SCPL analysis, especially focusing on real-world code in open-source repositories. We first create a benchmark that contains programs and documentation from three widely used scientific computing programming languages, then perform an adequate evaluation of existing advanced code language models on both code understanding and generation tasks using the new benchmark, and study the relations of different training strategies, model types, and model sizes to the performance of different tasks and languages. Evaluation results confirm that, compared to general programming languages, SCPLs are more challenging to understand, and especially to generate, but the use of code language models is nevertheless feasible, and the knowledge obtained from the general languages can be transferred to SCPL analysis. A deeper analysis reveals additional challenges in generating code that incorporates API calls relevant to computational mathematics. We believe that our findings can provide guidance on improving tooling and analyses for the scientific programming languages, and also inspire and motivate researchers to improve the robustness of existing code language models.

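Source journal

IEEE Transactions on Software Engineering (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 9.70
Self-citation rate: 10.80%
Articles published: 724
Review time: 6 months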

About the journal: IEEE Transactions on Software Engineering seeks contributions comprising well-defined theoretical results and empirical studies with potential impacts on software construction, analysis, or management. The scope of this Transactions extends from fundamental mechanisms to the development of principles and their application in specific environments. Specific topic areas include:

a) Development and maintenance methods and models: Techniques and principles for specifying, designing, and implementing software systems, encompassing notations and process models.
b) Assessment methods: Software tests, validation, reliability models, test and diagnosis procedures, software redundancy, design for error control, and measurements and evaluation of process and product aspects.
c) Software project management: Productivity factors, cost models, schedule and organizational issues, and standards.
d) Tools and environments: Specific tools, integrated tool environments, associated architectures, databases, and parallel and distributed processing issues.
e) System issues: Hardware-software trade-offs.
f) State-of-the-art surveys: Syntheses and comprehensive reviews of the historical development within specific areas of interest.