A Machine Learning Approach for Semi-Automated Search and Selection in Literature Studies

R. Ros, E. Bjarnason, P. Runeson
{"title":"A Machine Learning Approach for Semi-Automated Search and Selection in Literature Studies","authors":"R. Ros, E. Bjarnason, P. Runeson","doi":"10.1145/3084226.3084243","DOIUrl":null,"url":null,"abstract":"Background. Search and selection of primary studies in Systematic Literature Reviews (SLR) is labour intensive, and hard to replicate and update. Aims. We explore a machine learning approach to support semi-automated search and selection in SLRs to address these weaknesses. Method. We 1) train a classifier on an initial set of papers, 2) extend this set of papers by automated search and snowballing, 3) have the researcher validate the top paper, selected by the classifier, and 4) update the set of papers and iterate the process until a stopping criterion is met. Results. We demonstrate with a proof-of-concept tool that the proposed automated search and selection approach generates valid search strings and that the performance for subsets of primary studies can reduce the manual work by half. Conclusions. The approach is promising and the demonstrated advantages include cost savings and replicability. The next steps include further tool development and evaluate the approach on a complete SLR.","PeriodicalId":192290,"journal":{"name":"Proceedings of the 21st International Conference on Evaluation and Assessment in Software Engineering","volume":"2 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"40","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 21st International Conference on Evaluation and Assessment in Software Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3084226.3084243","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 40

Abstract

Background. Search and selection of primary studies in Systematic Literature Reviews (SLR) is labour intensive, and hard to replicate and update. Aims. We explore a machine learning approach to support semi-automated search and selection in SLRs to address these weaknesses. Method. We 1) train a classifier on an initial set of papers, 2) extend this set of papers by automated search and snowballing, 3) have the researcher validate the top paper, selected by the classifier, and 4) update the set of papers and iterate the process until a stopping criterion is met. Results. We demonstrate with a proof-of-concept tool that the proposed automated search and selection approach generates valid search strings and that the performance for subsets of primary studies can reduce the manual work by half. Conclusions. The approach is promising and the demonstrated advantages include cost savings and replicability. The next steps include further tool development and evaluate the approach on a complete SLR.
文献研究中半自动检索与选择的机器学习方法
背景。在系统文献综述(SLR)中搜索和选择主要研究是劳动密集型的,并且很难复制和更新。目标我们探索了一种机器学习方法来支持单反中的半自动搜索和选择,以解决这些弱点。方法。我们1)在一组初始的论文上训练一个分类器,2)通过自动搜索和滚雪球来扩展这组论文,3)让研究人员验证分类器选择的最重要的论文,4)更新这组论文并迭代这个过程,直到满足停止标准。结果。我们用一个概念验证工具证明了所提出的自动搜索和选择方法可以生成有效的搜索字符串,并且主要研究子集的性能可以减少一半的手动工作。结论。这种方法很有前途,其优点包括节省成本和可复制性。接下来的步骤包括进一步的工具开发,并在完整的单反上评估该方法。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信