An empirical study on the state-of-the-art methods for requirement-to-code traceability link recovery

IF 5.2 · Zone 2 (Computer Science) · JCR Q1 (Computer Science, Information Systems)
Bangchao Wang, Zhiyuan Zou, Hongyan Wan, Yuanbang Li, Yang Deng, Xingfu Li
Journal of King Saud University - Computer and Information Sciences, published 2024-07-01
DOI: 10.1016/j.jksuci.2024.102118
Full text: https://www.sciencedirect.com/science/article/pii/S1319157824002076
Citations: 0

Abstract

Requirements-to-code traceability link recovery (RC-TLR) establishes connections between requirements and target code artifacts, which is critical for the maintenance and evolution of large software systems. However, to the best of our knowledge, no existing experimental study focuses on state-of-the-art (SOTA) methods for the RC-TLR problem, and the field lacks uniform benchmarks for evaluating new methods. We developed a framework that identifies SOTA methods through a Systematic Literature Review and applied it to RC-TLR research published from 2018 to 2023. By replicating experiments with 6 methods on 13 datasets, we observed that, among information retrieval-based methods, the Close Relations between Target artifacts-based method (CRT), TraceAbility Recovery by Consensual biTerms (TAROT), and Fine-grained TLR (FTLR) performed well on the COEST dataset, while Combining Part-Of-Speech with information-retrieval techniques (Conpos) and TAROT achieved promising results on large datasets. Among machine learning-based methods, Random Forest consistently performed strongly on all datasets. We hope that this study can provide a comparative benchmark for performance evaluation in the RC-TLR field. The resource repository that we have established is expected to reduce the workload of researchers in performance analysis and to promote progress in the field.
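
For readers new to the area, the general idea behind information retrieval-based RC-TLR can be illustrated with a plain TF-IDF vector space baseline: requirements and code artifacts are tokenized, weighted, and every pair is ranked by cosine similarity. The sketch below is not CRT, TAROT, FTLR, or Conpos, whose details the abstract does not give; the function names, sample requirements, file names, gold links, and the 0.1 threshold are all hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split camelCase identifiers, lowercase, and keep alphabetic tokens."""
    text = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed IDF so that no weight is ever zero or negative.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_links(requirements, code_artifacts):
    """Rank every (requirement, code artifact) pair by similarity, best first."""
    req_ids, code_ids = list(requirements), list(code_artifacts)
    docs = [tokenize(requirements[r]) for r in req_ids]
    docs += [tokenize(code_artifacts[c]) for c in code_ids]
    vecs = tfidf_vectors(docs)
    req_vecs, code_vecs = vecs[: len(req_ids)], vecs[len(req_ids):]
    links = [
        (r, c, cosine(rv, cv))
        for r, rv in zip(req_ids, req_vecs)
        for c, cv in zip(code_ids, code_vecs)
    ]
    return sorted(links, key=lambda link: link[2], reverse=True)


def precision_recall(predicted, gold):
    """Precision and recall of predicted links against a gold standard."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy data, not taken from any dataset in the study.
requirements = {
    "REQ-1": "The system shall let a user reset a forgotten password by email.",
    "REQ-2": "The system shall export monthly sales reports as PDF.",
}
code_artifacts = {
    "PasswordResetService.java": "class PasswordResetService { void sendResetEmail(User user) {} }",
    "ReportExporter.java": "class ReportExporter { void exportSalesReportPdf(Month month) {} }",
}
gold_links = {
    ("REQ-1", "PasswordResetService.java"),
    ("REQ-2", "ReportExporter.java"),
}

ranked = rank_links(requirements, code_artifacts)
# Keep pairs above an arbitrary, illustrative similarity threshold.
predicted_links = {(r, c) for r, c, score in ranked if score > 0.1}
precision, recall = precision_recall(predicted_links, gold_links)
```

Evaluating candidate links against a gold standard with precision and recall, as in the last lines, is a common way to compare TLR methods; whether the study uses exactly these measures is not stated in the abstract.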

Source journal
CiteScore: 10.50
Self-citation rate: 8.70%
Articles published: 656
Review time: 29 days
About the journal: In 2022 the Journal of King Saud University - Computer and Information Sciences will become an author-paid open access journal. Authors who submit their manuscript after October 31st 2021 will be asked to pay an Article Processing Charge (APC) after acceptance of their paper to make their work immediately, permanently, and freely accessible to all. The Journal of King Saud University - Computer and Information Sciences is a refereed, international journal that covers all aspects of both the foundations of computing and its practical applications.