STRICT: Information retrieval based search term identification for concept location

M. M. Rahman, C. Roy
Published in: 2017 IEEE 24th International Conference on Software Analysis, Evolution and Reengineering (SANER), pp. 79-90, February 2017
DOI: https://doi.org/10.1109/SANER.2017.7884611
Citation count: 30

Abstract

During maintenance, software developers deal with numerous change requests that are written in an unstructured fashion using natural language. Such natural language texts describe the change requirement in terms of various domain-related concepts. Developers need to derive appropriate search terms from those concepts so that they can locate the relevant places in the source code using a search technique. Once such locations are identified, they can implement the requested changes there. Studies suggest that developers often perform poorly in coming up with good search terms for a change task. In this paper, we propose a novel technique, STRICT, that automatically identifies suitable search terms for a software change task by analyzing its task description using two information retrieval (IR) techniques, TextRank and POSRank. These IR techniques determine a term's importance based not only on its co-occurrences with other important terms but also on its syntactic relationships with them. Experiments using 1,939 change requests from eight subject systems report that STRICT can identify better-quality search terms than the baseline terms for 52%–62% of the requests, with 30%–57% Top-10 retrieval accuracy, which is promising. Comparison with two state-of-the-art techniques not only validates our empirical findings but also demonstrates the superiority of our technique.
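To make the co-occurrence idea concrete, the sketch below ranks the terms of a change request with a generic TextRank-style score: terms that appear near each other are linked in a graph, and a term becomes important when it is linked to other important terms. This is only an illustration under stated assumptions, not the authors' implementation of STRICT. The stop-word list, the window size, the damping factor, the iteration count, and all function names are hypothetical choices, and the POSRank half of the technique (syntactic relationships) is not covered.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "the", "is", "in", "on", "of", "to",
              "and", "when", "it", "with", "for", "not"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stop words and very short tokens."""
    return [t for t in re.findall(r"[a-z]+", text.lower())
            if t not in STOP_WORDS and len(t) > 2]


def build_cooccurrence_graph(tokens, window=2):
    """Link each term to the distinct terms among the next `window` tokens."""
    graph = defaultdict(set)
    for i, term in enumerate(tokens):
        for j in range(i + 1, min(i + 1 + window, len(tokens))):
            other = tokens[j]
            if other != term:
                graph[term].add(other)
                graph[other].add(term)
    return dict(graph)


def text_rank(graph, damping=0.85, iterations=50):
    """Iteratively score each term from the scores of its neighbours."""
    scores = {term: 1.0 for term in graph}
    for _ in range(iterations):
        new_scores = {}
        for term, neighbours in graph.items():
            # Each neighbour shares its score equally among its own links.
            incoming = sum(scores[n] / len(graph[n]) for n in neighbours)
            new_scores[term] = (1 - damping) + damping * incoming
        scores = new_scores
    return scores


def suggest_search_terms(description, top_k=5):
    """Return the top_k highest-scoring terms of a change request text."""
    graph = build_cooccurrence_graph(tokenize(description))
    scores = text_rank(graph)
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]


if __name__ == "__main__":
    request = ("Editor crashes when saving file. "
               "Saving file in editor loses file encoding.")
    print(suggest_search_terms(request, top_k=3))
```

The returned terms could then be used as a query against a code search engine. A term that co-occurs with many other well-connected terms ranks higher than one that appears only at the margins of the description.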