通过集成链接分析和预训练语言模型技术预测代码到测试的共同进化

IF 5.6 1区 计算机科学 Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING
Yuyong Liu;Zhifei Chen;Lin Chen;Yanhui Li;Xuansong Li;Wei Song
{"title":"通过集成链接分析和预训练语言模型技术预测代码到测试的共同进化","authors":"Yuyong Liu;Zhifei Chen;Lin Chen;Yanhui Li;Xuansong Li;Wei Song","doi":"10.1109/TSE.2025.3583027","DOIUrl":null,"url":null,"abstract":"Tests, as an essential artifact, should co-evolve with the production code to ensure that the associated production code satisfies specification. However, developers often postpone or even forget to update tests, making the tests outdated and lag behind the code. To predict which tests need to be updated when production code is changed, it is challenging to identify all related tests and determine their change probabilities due to complex change scenarios. This paper fills the gap and proposes a hybrid approach named COTE to predict code-to-test co-evolution. We first compute the linked test candidates based on different code-to-test dependencies. After that, we identify common co-change patterns by building a method-level dependence graph. For the remaining ambiguous patterns, we leverage a pre-trained language model which captures the semantic features of code and the change reasons contained in commit messages to judge one test’s likelihood of being updated. Experiments on our datasets consisting of 6,314 samples extracted from 5,000 Java projects show that COTE outperforms state-of-the-art approaches, achieving a precision of 89.0% and a recall of 71.6%. This work can help practitioners reduce test maintenance costs and improve software quality.","PeriodicalId":13324,"journal":{"name":"IEEE Transactions on Software Engineering","volume":"51 8","pages":"2232-2253"},"PeriodicalIF":5.6000,"publicationDate":"2025-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"COTE: Predicting Code-to-Test Co-Evolution by Integrating Link Analysis and Pre-Trained Language Model Techniques\",\"authors\":\"Yuyong Liu;Zhifei Chen;Lin Chen;Yanhui Li;Xuansong Li;Wei Song\",\"doi\":\"10.1109/TSE.2025.3583027\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Tests, as an essential artifact, should co-evolve with the production code to ensure that the associated production code satisfies specification. However, developers often postpone or even forget to update tests, making the tests outdated and lag behind the code. To predict which tests need to be updated when production code is changed, it is challenging to identify all related tests and determine their change probabilities due to complex change scenarios. This paper fills the gap and proposes a hybrid approach named COTE to predict code-to-test co-evolution. We first compute the linked test candidates based on different code-to-test dependencies. After that, we identify common co-change patterns by building a method-level dependence graph. For the remaining ambiguous patterns, we leverage a pre-trained language model which captures the semantic features of code and the change reasons contained in commit messages to judge one test’s likelihood of being updated. Experiments on our datasets consisting of 6,314 samples extracted from 5,000 Java projects show that COTE outperforms state-of-the-art approaches, achieving a precision of 89.0% and a recall of 71.6%. This work can help practitioners reduce test maintenance costs and improve software quality.\",\"PeriodicalId\":13324,\"journal\":{\"name\":\"IEEE Transactions on Software Engineering\",\"volume\":\"51 8\",\"pages\":\"2232-2253\"},\"PeriodicalIF\":5.6000,\"publicationDate\":\"2025-06-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Software Engineering\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/11053682/\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, SOFTWARE ENGINEERING\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Software Engineering","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/11053682/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
引用次数: 0

摘要

测试,作为一个重要的工件,应该与生产代码共同发展,以确保相关的生产代码满足规范。然而,开发人员经常推迟甚至忘记更新测试,使测试过时并落后于代码。为了预测在产品代码更改时需要更新哪些测试,识别所有相关的测试并确定由于复杂的更改场景而导致的更改概率是具有挑战性的。本文填补了这一空白,并提出了一种名为COTE的混合方法来预测代码到测试的共同进化。我们首先根据不同的代码到测试依赖关系计算链接的测试候选项。之后,我们通过构建方法级依赖图来识别常见的共变更模式。对于剩余的模糊模式,我们利用一个预先训练好的语言模型,该模型捕获代码的语义特征和提交消息中包含的更改原因,以判断一个测试被更新的可能性。在我们从5000个Java项目中提取的6314个样本的数据集上进行的实验表明,COTE优于最先进的方法,达到了89.0%的精度和71.6%的召回率。这项工作可以帮助实践者减少测试维护成本并提高软件质量。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
COTE: Predicting Code-to-Test Co-Evolution by Integrating Link Analysis and Pre-Trained Language Model Techniques
Tests, as an essential artifact, should co-evolve with the production code to ensure that the associated production code satisfies specification. However, developers often postpone or even forget to update tests, making the tests outdated and lag behind the code. To predict which tests need to be updated when production code is changed, it is challenging to identify all related tests and determine their change probabilities due to complex change scenarios. This paper fills the gap and proposes a hybrid approach named COTE to predict code-to-test co-evolution. We first compute the linked test candidates based on different code-to-test dependencies. After that, we identify common co-change patterns by building a method-level dependence graph. For the remaining ambiguous patterns, we leverage a pre-trained language model which captures the semantic features of code and the change reasons contained in commit messages to judge one test’s likelihood of being updated. Experiments on our datasets consisting of 6,314 samples extracted from 5,000 Java projects show that COTE outperforms state-of-the-art approaches, achieving a precision of 89.0% and a recall of 71.6%. This work can help practitioners reduce test maintenance costs and improve software quality.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
IEEE Transactions on Software Engineering
IEEE Transactions on Software Engineering 工程技术-工程:电子与电气
CiteScore
9.70
自引率
10.80%
发文量
724
审稿时长
6 months
期刊介绍: IEEE Transactions on Software Engineering seeks contributions comprising well-defined theoretical results and empirical studies with potential impacts on software construction, analysis, or management. The scope of this Transactions extends from fundamental mechanisms to the development of principles and their application in specific environments. Specific topic areas include: a) Development and maintenance methods and models: Techniques and principles for specifying, designing, and implementing software systems, encompassing notations and process models. b) Assessment methods: Software tests, validation, reliability models, test and diagnosis procedures, software redundancy, design for error control, and measurements and evaluation of process and product aspects. c) Software project management: Productivity factors, cost models, schedule and organizational issues, and standards. d) Tools and environments: Specific tools, integrated tool environments, associated architectures, databases, and parallel and distributed processing issues. e) System issues: Hardware-software trade-offs. f) State-of-the-art surveys: Syntheses and comprehensive reviews of the historical development within specific areas of interest.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信