On the Applicability of Code Language Models to Scientific Computing Programs
Authors: Qianhui Zhao; Fang Liu; Xiao Long; Chengru Wu; Li Zhang
Journal: IEEE Transactions on Software Engineering, vol. 51, no. 6, pp. 1685-1701 (Journal Article)
Publication date: 2025-04-25
DOI: 10.1109/TSE.2025.3564599
URL: https://ieeexplore.ieee.org/document/10977820/
Impact factor: 6.5; JCR: Q1 (Computer Science, Software Engineering)
Scientific Computing Programming Languages (SCPLs), such as MATLAB and R, are widely used for computational mathematics. In recent years, pre-trained code language models (CLMs) have automated many code-related tasks across various general programming languages. SCPLs share many similarities with general programming languages, including similar syntactic structures and identifier semantics. Despite these similarities, the two differ in important ways: for example, SCPLs contain many numerical operations and dedicated libraries. However, there has been little comprehensive work analyzing CLMs’ capabilities in understanding and generating practical scientific computing programs. To this end, we investigate the applicability of code language models to SCPL analysis, focusing especially on real-world code in open-source repositories. We first create a benchmark that contains programs and documentation from three widely used scientific computing programming languages. We then perform a thorough evaluation of existing advanced code language models on both code understanding and generation tasks using the new benchmark, and study how different training strategies, model types, and model sizes relate to performance across tasks and languages. Evaluation results confirm that, compared to general programming languages, SCPLs are more challenging to understand, and especially to generate, but the use of code language models is nevertheless feasible, and the knowledge obtained from general languages can be transferred to SCPL analysis. A deeper analysis reveals additional challenges in generating code that incorporates API calls relevant to computational mathematics. We believe that our findings can provide guidance on improving tooling and analyses for scientific programming languages, and can also motivate researchers to improve the robustness of existing code language models.
About the journal:
IEEE Transactions on Software Engineering seeks contributions comprising well-defined theoretical results and empirical studies with potential impacts on software construction, analysis, or management. The scope of this Transactions extends from fundamental mechanisms to the development of principles and their application in specific environments. Specific topic areas include:
a) Development and maintenance methods and models: Techniques and principles for specifying, designing, and implementing software systems, encompassing notations and process models.
b) Assessment methods: Software tests, validation, reliability models, test and diagnosis procedures, software redundancy, design for error control, and measurements and evaluation of process and product aspects.
c) Software project management: Productivity factors, cost models, schedule and organizational issues, and standards.
d) Tools and environments: Specific tools, integrated tool environments, associated architectures, databases, and parallel and distributed processing issues.
e) System issues: Hardware-software trade-offs.
f) State-of-the-art surveys: Syntheses and comprehensive reviews of the historical development within specific areas of interest.