Hongyan Wan, Xinyu He, Yang Deng, Bangchao Wang
Journal: Information Processing & Management, Volume 62, Issue 6, Article 104287
Published: 2025-07-11
DOI: 10.1016/j.ipm.2025.104287
URL: https://www.sciencedirect.com/science/article/pii/S0306457325002286
Impact factor: 6.9 (JCR Q1, Computer Science, Information Systems)
A systematic mapping study of information retrieval-based requirements traceability methods
Requirements traceability (RT) is critical for ensuring consistency, quality, and maintainability in software development. While learning-based approaches have gained increasing attention, traditional information retrieval (IR) methods remain widely used in practice. However, existing literature lacks a systematic synthesis of their best practices and recent advancements. To address this gap, we conducted a systematic mapping study (SMS) of 40 primary studies published between 2014 and 2024, selected from an initial pool of 2,052 publications. Our review examines widely adopted IR models, enhancement strategies, evaluation datasets, performance metrics, and baseline methods. Specifically, we identify and categorize 32 representative enhancement strategies into four methodological types: (1) artifact text information, (2) artifact structural information, (3) model-based optimization, and (4) human intervention. Furthermore, we analyze 53 commonly used datasets and 9 evaluation metrics for validation. Our findings indicate that among various IR models, the Vector Space Model (VSM) and Latent Semantic Indexing (LSI) typically achieve stronger performance in RT tasks. This study provides a comprehensive synthesis of IR-based RT research and offers practical insights to advance traceability in software engineering.
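As an illustration of the kind of IR model the abstract refers to, the following is a minimal sketch of VSM-style trace link recovery: a requirement and candidate artifacts are turned into TF-IDF vectors and ranked by cosine similarity. It is not taken from the study; the function names, the smoothed IDF formula, and the sample requirement and artifact texts are all assumptions chosen for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a list of documents.

    Uses a smoothed IDF, log((1 + N) / (1 + df)) + 1; this is one common
    variant and an assumption of this sketch.
    """
    counts_per_doc = [Counter(tokenize(d)) for d in docs]
    n_docs = len(docs)
    doc_freq = Counter()
    for counts in counts_per_doc:
        doc_freq.update(counts.keys())
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return [{t: c * idf[t] for t, c in counts.items()} for counts in counts_per_doc]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(requirement, artifacts):
    """Rank artifacts (name -> text) by similarity to the requirement."""
    names = list(artifacts)
    vectors = tfidf_vectors([requirement] + [artifacts[n] for n in names])
    query, candidates = vectors[0], vectors[1:]
    scored = [(name, cosine(query, vec)) for name, vec in zip(names, candidates)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical requirement and artifact descriptions, for illustration only.
requirement = "The system shall encrypt user passwords before storing them"
artifacts = {
    "PasswordHasher": "hash and encrypt user passwords before storing",
    "ReportExporter": "export monthly sales report to pdf",
    "LoginController": "check user login credentials and passwords",
}

ranking = rank_candidates(requirement, artifacts)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In practice, candidate links above a chosen similarity threshold (or the top-k of the ranking) would be passed to an analyst for vetting; the enhancement strategies the study catalogues would change how the text is preprocessed or how the scores are adjusted.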
About the journal:
Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing.
We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.