使用语言模型从python代码自动化软件大小测量

IF 3.1 2区计算机科学 Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING

Automated Software Engineering Pub Date : 2025-10-18 DOI:10.1007/s10515-025-00571-z

Samet Tenekeci, Hüseyin Ünlü, Bedir Arda Gül, Damla Keleş, Murat Küük, Onur Demirörs

{"title":"使用语言模型从python代码自动化软件大小测量","authors":"Samet Tenekeci, Hüseyin Ünlü, Bedir Arda Gül, Damla Keleş, Murat Küük, Onur Demirörs","doi":"10.1007/s10515-025-00571-z","DOIUrl":null,"url":null,"abstract":"<div><p>Software size is a key input for project planning, effort estimation, and productivity analysis. While pre-trained language models have shown promise in deriving functional size from natural-language requirements, measuring size directly from source code remains under-explored. Yet, code-based size measurement is critical in modern workflows where requirement documents are often incomplete or unavailable, especially in Agile development environments. This exploratory study investigates the use of CodeBERT, a pre-trained bimodal transformer model, for measuring software size directly from Python source code according to two measurement methods: COSMIC Function Points and MicroM. We construct two curated datasets from the Python subset of the CodeSearchNet corpus, and manually annotate each function with its corresponding size. Our experimental results show that CodeBERT can successfully measure COSMIC data movements with up to 91.4% accuracy and generalize to the functional, architectural, and algorithmic event types defined in MicroM, reaching up to 81.5% accuracy. These findings highlight the potential of code-based language models for automated functional size measurement when requirement artifacts are absent or unreliable.</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"33 1","pages":""},"PeriodicalIF":3.1000,"publicationDate":"2025-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Automating software size measurement from python code using language models\",\"authors\":\"Samet Tenekeci, Hüseyin Ünlü, Bedir Arda Gül, Damla Keleş, Murat Küük, Onur Demirörs\",\"doi\":\"10.1007/s10515-025-00571-z\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Software size is a key input for project planning, effort estimation, and productivity analysis. While pre-trained language models have shown promise in deriving functional size from natural-language requirements, measuring size directly from source code remains under-explored. Yet, code-based size measurement is critical in modern workflows where requirement documents are often incomplete or unavailable, especially in Agile development environments. This exploratory study investigates the use of CodeBERT, a pre-trained bimodal transformer model, for measuring software size directly from Python source code according to two measurement methods: COSMIC Function Points and MicroM. We construct two curated datasets from the Python subset of the CodeSearchNet corpus, and manually annotate each function with its corresponding size. Our experimental results show that CodeBERT can successfully measure COSMIC data movements with up to 91.4% accuracy and generalize to the functional, architectural, and algorithmic event types defined in MicroM, reaching up to 81.5% accuracy. These findings highlight the potential of code-based language models for automated functional size measurement when requirement artifacts are absent or unreliable.</p></div>\",\"PeriodicalId\":55414,\"journal\":{\"name\":\"Automated Software Engineering\",\"volume\":\"33 1\",\"pages\":\"\"},\"PeriodicalIF\":3.1000,\"publicationDate\":\"2025-10-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Automated Software Engineering\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://link.springer.com/article/10.1007/s10515-025-00571-z\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, SOFTWARE ENGINEERING\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Automated Software Engineering","FirstCategoryId":"94","ListUrlMain":"https://link.springer.com/article/10.1007/s10515-025-00571-z","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}

引用次数: 0

摘要

软件大小是项目计划、工作量估计和生产力分析的关键输入。虽然预训练的语言模型在从自然语言需求中获得功能大小方面显示出了希望，但直接从源代码中测量大小仍然有待探索。然而，在需求文档经常不完整或不可用的现代工作流中，尤其是在敏捷开发环境中，基于代码的大小度量是至关重要的。本探索性研究调查了CodeBERT的使用，CodeBERT是一种预训练的双峰变压器模型，用于根据两种测量方法直接从Python源代码测量软件大小：COSMIC功能点和MicroM。我们从CodeSearchNet语料库的Python子集中构建了两个精心策划的数据集，并用相应的大小手动注释每个函数。我们的实验结果表明，CodeBERT可以成功地测量COSMIC数据移动，准确率高达91.4%，并且可以推广到MicroM中定义的功能、架构和算法事件类型，准确率高达81.5%。这些发现突出了在需求工件缺失或不可靠的情况下，基于代码的语言模型用于自动化功能大小度量的潜力。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Automating software size measurement from python code using language models

Software size is a key input for project planning, effort estimation, and productivity analysis. While pre-trained language models have shown promise in deriving functional size from natural-language requirements, measuring size directly from source code remains under-explored. Yet, code-based size measurement is critical in modern workflows where requirement documents are often incomplete or unavailable, especially in Agile development environments. This exploratory study investigates the use of CodeBERT, a pre-trained bimodal transformer model, for measuring software size directly from Python source code according to two measurement methods: COSMIC Function Points and MicroM. We construct two curated datasets from the Python subset of the CodeSearchNet corpus, and manually annotate each function with its corresponding size. Our experimental results show that CodeBERT can successfully measure COSMIC data movements with up to 91.4% accuracy and generalize to the functional, architectural, and algorithmic event types defined in MicroM, reaching up to 81.5% accuracy. These findings highlight the potential of code-based language models for automated functional size measurement when requirement artifacts are absent or unreliable.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Automated Software Engineering 工程技术-计算机：软件工程

CiteScore

4.80

自引率

11.80%

发文量

审稿时长

>12 weeks

期刊介绍： This journal details research, tutorial papers, survey and accounts of significant industrial experience in the foundations, techniques, tools and applications of automated software engineering technology. This includes the study of techniques for constructing, understanding, adapting, and modeling software artifacts and processes. Coverage in Automated Software Engineering examines both automatic systems and collaborative systems as well as computational models of human software engineering activities. In addition, it presents knowledge representations and artificial intelligence techniques applicable to automated software engineering, and formal techniques that support or provide theoretical foundations. The journal also includes reviews of books, software, conferences and workshops.