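Using Dynamic and Static Techniques to Establish Traceability Links Between Production Code and Test Code on Python Projects: A Replication Study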

IF 1.8, Zone 4 (Computer Science), Q3 COMPUTER SCIENCE, SOFTWARE ENGINEERING

Journal of Software-Evolution and Process Pub Date : 2025-03-11 DOI:10.1002/smr.70011

Zhifei Chen, Chiheng Jia, Yanhui Li, Lin Chen

Volume 37, Issue 3
Publication type: Journal Article
URL: https://onlinelibrary.wiley.com/doi/10.1002/smr.70011

Citations: 0

Abstract



Test-to-code traceability, the relationship between test code and production code, plays an essential role in the verification, reliability, and certification of software systems. Prior work on test-to-code traceability focuses mainly on Java. However, because Python allows more flexible testing styles, it is still unknown whether existing traceability approaches work well on Python projects. To address this gap, this paper evaluates whether existing traceability approaches can accurately identify test-to-code links in Python projects. We collected seven popular Python projects and carried out an exploratory study at both the method and module levels (involving a total of 3198 test cases). On these projects, we evaluated 15 individual traceability techniques along with cross-level information propagation and four combining resolution strategies. The results reveal that test-to-code traceability approaches perform differently on Python than on Java in several respects: (1) most of the existing techniques have poor effectiveness for Python; (2) after augmenting with cross-level information, recall surprisingly drops; and (3) the machine-learning-based combination approach achieves the best recall but the worst precision. These findings shed light on the best traceability approaches for Python projects and provide guidelines for researchers and the Python community.
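
To make the precision and recall figures in the abstract concrete, here is a minimal sketch of how a set of predicted test-to-code links could be scored against a ground truth. The abstract does not describe the 15 techniques, so the `test_<name>` naming heuristic below is an assumption chosen purely for illustration, and every test name, function name, and link is made up rather than taken from the study.

```python
from typing import Set, Tuple

# A link pairs a test with the production function it is believed to exercise.
Link = Tuple[str, str]


def naming_convention_links(tests: Set[str], functions: Set[str]) -> Set[Link]:
    """Hypothetical heuristic: link test_<name> to a function called <name>."""
    links = set()
    for test in tests:
        if test.startswith("test_"):
            candidate = test[len("test_"):]
            if candidate in functions:
                links.add((test, candidate))
    return links


def precision_recall(predicted: Set[Link], truth: Set[Link]) -> Tuple[float, float]:
    """Precision = correct / predicted links; recall = correct / true links."""
    correct = len(predicted & truth)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(truth) if truth else 0.0
    return precision, recall


# Made-up data for illustration only; not drawn from the paper.
tests = {"test_parse", "test_render", "test_roundtrip"}
functions = {"parse", "render", "dump"}
truth = {
    ("test_parse", "parse"),
    ("test_render", "render"),
    ("test_roundtrip", "parse"),
    ("test_roundtrip", "dump"),
}

predicted = naming_convention_links(tests, functions)
precision, recall = precision_recall(predicted, truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
# prints: precision=1.00 recall=0.50
```

In this toy case the heuristic never links `test_roundtrip`, because its name matches no function, so every predicted link is correct while half of the true links are missed. A test whose name does not mirror a single function is one example of the flexible testing styles that can lower recall for name-based matching.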


Source journal: Journal of Software-Evolution and Process (COMPUTER SCIENCE, SOFTWARE ENGINEERING)

Self-citation rate: 10.00%

Articles published: 109