Bangchao Wang, Zhiyuan Zou, Hongyan Wan, Yuanbang Li, Yang Deng, Xingfu Li
Journal of King Saud University - Computer and Information Sciences (Journal Article), published 2024-07-01. Impact factor 5.2; JCR Q1 (Computer Science, Information Systems). DOI: 10.1016/j.jksuci.2024.102118. URL: https://www.sciencedirect.com/science/article/pii/S1319157824002076
An empirical study on the state-of-the-art methods for requirement-to-code traceability link recovery
Requirements-to-code traceability link recovery (RC-TLR) establishes connections between requirements and target code artifacts, which is critical for the maintenance and evolution of large software systems. However, to the best of our knowledge, no existing experimental study focuses on state-of-the-art (SOTA) methods for the RC-TLR problem, and the field lacks uniform benchmarks for evaluating new methods. We developed a framework that identifies SOTA methods using the Systematic Literature Review method and applied it to research in the RC-TLR field from 2018 to 2023. By replicating experiments with 6 methods on 13 datasets, we observed that, among information retrieval-based methods, the Close Relations between Target artifacts-based method (CRT), TraceAbility Recovery by Consensual biTerms (TAROT), and Fine-grained TLR (FTLR) performed well on the COEST dataset, while Combining Part-Of-Speech with information-retrieval techniques (Conpos) and TAROT achieved promising results on large datasets. Among machine learning-based methods, Random Forest consistently exhibited strong performance on all datasets. We hope that this study can provide a comparative benchmark for performance evaluation in the RC-TLR field. The resource repository that we have established is expected to reduce the workload of researchers in performance analysis and to promote progress in the field.
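To make the task concrete, the following is a minimal sketch of what an information retrieval-based RC-TLR baseline can look like: requirements and code files are turned into TF-IDF vectors, and a candidate link is reported when their cosine similarity exceeds a threshold. This is a generic vector space model, not CRT, TAROT, FTLR, or Conpos; the function names, the threshold value, and the sample artifacts are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split camelCase identifiers, lowercase, and keep alphabetic tokens only."""
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return [t for t in re.split(r"[^a-z]+", spaced.lower()) if t]


def tfidf_vectors(token_lists):
    """Return one sparse TF-IDF vector (term -> weight) per token list."""
    n = len(token_lists)
    df = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        tf = Counter(tokens)
        # Smoothed inverse document frequency; an assumption of this sketch.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recover_links(requirements, code, threshold=0.05):
    """Return (requirement id, code id, score) triples, highest score first.

    `requirements` and `code` map artifact ids to their raw text.
    Only pairs whose similarity is above `threshold` are reported.
    """
    req_ids, code_ids = list(requirements), list(code)
    token_lists = [tokenize(requirements[r]) for r in req_ids]
    token_lists += [tokenize(code[c]) for c in code_ids]
    vectors = tfidf_vectors(token_lists)
    req_vecs = vectors[: len(req_ids)]
    code_vecs = vectors[len(req_ids):]
    links = []
    for r, rv in zip(req_ids, req_vecs):
        for c, cv in zip(code_ids, code_vecs):
            score = cosine(rv, cv)
            if score > threshold:
                links.append((r, c, score))
    return sorted(links, key=lambda link: link[2], reverse=True)


if __name__ == "__main__":
    # Hypothetical artifacts, for illustration only.
    sample_requirements = {
        "R1": "The user shall log in with a password",
        "R2": "The system shall export reports as PDF",
    }
    sample_code = {
        "LoginController.java": "class LoginController { boolean checkPassword(User user, String password) }",
        "PdfExporter.java": "class PdfExporter { void exportReport(Report report) }",
    }
    for req_id, code_id, score in recover_links(sample_requirements, sample_code):
        print(f"{req_id} -> {code_id} (similarity {score:.2f})")
```

In an evaluation like the one described above, the recovered links would be compared against a ground-truth link set for each dataset; the threshold here is arbitrary and would normally be tuned or replaced by a ranking cutoff.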
About the journal:
In 2022 the Journal of King Saud University - Computer and Information Sciences will become an author-paid open access journal. Authors who submit their manuscript after October 31st, 2021 will be asked to pay an Article Processing Charge (APC) after acceptance of their paper to make their work immediately, permanently, and freely accessible to all. The Journal of King Saud University - Computer and Information Sciences is a refereed, international journal that covers all aspects of both the foundations of computing and its practical applications.