A systematic mapping study of information retrieval-based requirements traceability methods

IF 6.9 | Zone 1 (Management) | JCR Q1: COMPUTER SCIENCE, INFORMATION SYSTEMS
Hongyan Wan, Xinyu He, Yang Deng, Bangchao Wang
DOI: 10.1016/j.ipm.2025.104287
Journal: Information Processing & Management, Volume 62, Issue 6, Article 104287
Publication date: 2025-07-11
Publication type: Journal Article
Open access: No
URL: https://www.sciencedirect.com/science/article/pii/S0306457325002286
Citations: 0

Abstract

Requirements traceability (RT) is critical for ensuring consistency, quality, and maintainability in software development. While learning-based approaches have gained increasing attention, traditional information retrieval (IR) methods remain widely used in practice. However, existing literature lacks a systematic synthesis of their best practices and recent advancements. To address this gap, we conducted a systematic mapping study (SMS) of 40 primary studies published between 2014 and 2024, selected from an initial pool of 2,052 publications. Our review examines widely adopted IR models, enhancement strategies, evaluation datasets, performance metrics, and baseline methods. Specifically, we identify and categorize 32 representative enhancement strategies into four methodological types: (1) artifact text information, (2) artifact structural information, (3) model-based optimization, and (4) human intervention. Furthermore, we analyze 53 commonly used datasets and 9 evaluation metrics for validation. Our findings indicate that among various IR models, the Vector Space Model (VSM) and Latent Semantic Indexing (LSI) typically achieve stronger performance in RT tasks. This study provides a comprehensive synthesis of IR-based RT research and offers practical insights to advance traceability in software engineering.
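
The abstract names the Vector Space Model (VSM) as one of the stronger IR models for RT but does not describe an implementation. As a rough illustration only, the sketch below shows the general VSM idea: represent each artifact as a TF-IDF vector and rank requirement-to-artifact pairs by cosine similarity. The function names, the smoothed IDF formula, and the sample requirements and artifacts are all assumptions made for this example and are not taken from the paper. Preprocessing steps such as stemming and stop-word removal are deliberately left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF (an assumption of this sketch) keeps every weight positive.
        vectors.append(
            {t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidate_links(requirements, artifacts):
    """Return (requirement id, artifact id, score) triples, best score first."""
    texts = list(requirements.values()) + list(artifacts.values())
    vectors = tfidf_vectors(texts)
    req_vecs = dict(zip(requirements, vectors[: len(requirements)]))
    art_vecs = dict(zip(artifacts, vectors[len(requirements):]))
    links = [
        (r, a, cosine(rv, av))
        for r, rv in req_vecs.items()
        for a, av in art_vecs.items()
    ]
    return sorted(links, key=lambda link: link[2], reverse=True)


# Hypothetical sample data, invented for illustration.
requirements = {
    "REQ-1": "The user shall log in with a password",
    "REQ-2": "The system shall export reports as PDF",
}
artifacts = {
    "LoginService": "login service validates user password",
    "ReportExporter": "export report to pdf file",
}

ranked = rank_candidate_links(requirements, artifacts)
for req_id, art_id, score in ranked:
    print(f"{req_id} -> {art_id}: {score:.3f}")
```

In practice a cutoff threshold or top-k selection would be applied to the ranked list to obtain candidate trace links. The sample also shows a known weakness of plain term matching: "reports" and "report" do not match without stemming.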
Source journal
Information Processing & Management (Engineering & Technology - Computer Science: Information Systems)
CiteScore: 17.00
Self-citation rate: 11.60%
Articles published: 276
Review time: 39 days
About the journal: Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Its scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing. The journal aims to serve both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. It places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research.