Transfer Learning for Mining Feature Requests and Bug Reports from Tweets and App Store Reviews

Pablo Restrepo Henao, Jannik Fischbach, Dominik Spies, Julian Frattini, Andreas Vogelsang
Published in: 2021 IEEE 29th International Requirements Engineering Conference Workshops (REW), 2021-08-02
DOI: 10.1109/REW53955.2021.00019 (https://doi.org/10.1109/REW53955.2021.00019)
Citations: 19

Abstract

Identifying feature requests and bug reports in user comments holds great potential for development teams. However, automated mining of RE-related information from social media and app stores is challenging since (1) about 70% of user comments contain noisy, irrelevant information, (2) the number of user comments grows daily, making manual analysis infeasible, and (3) user comments are written in different languages. Existing approaches build on traditional machine learning (ML) and deep learning (DL), but fail to detect feature requests and bug reports with the high Recall and acceptable Precision that this task requires. In this paper, we investigate the potential of transfer learning (TL) for the classification of user comments. Specifically, we train both monolingual and multilingual BERT models and compare their performance with state-of-the-art methods. We found that monolingual BERT models outperform existing baseline methods in the classification of English App Reviews as well as English and Italian Tweets. However, we also observed that the application of heavyweight TL models does not necessarily lead to better performance. In fact, our multilingual BERT models perform worse than traditional ML methods.
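To make the evaluation criteria concrete, the following is a minimal sketch of how per-class Precision and Recall could be computed for a comment classifier. It is not the authors' evaluation code: the function name `precision_recall`, the label names ("bug", "feature", "irrelevant"), and the toy predictions are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def precision_recall(gold, predicted, label):
    """Return (precision, recall) for a single class label."""
    counts = Counter()
    for g, p in zip(gold, predicted):
        if p == label and g == label:
            counts["tp"] += 1  # correctly flagged as this class
        elif p == label:
            counts["fp"] += 1  # flagged as this class, but gold differs
        elif g == label:
            counts["fn"] += 1  # gold is this class, but it was missed
    predicted_pos = counts["tp"] + counts["fp"]
    actual_pos = counts["tp"] + counts["fn"]
    precision = counts["tp"] / predicted_pos if predicted_pos else 0.0
    recall = counts["tp"] / actual_pos if actual_pos else 0.0
    return precision, recall


# Hypothetical toy labels, not data from the paper.
gold = ["bug", "feature", "irrelevant", "bug", "irrelevant", "feature"]
pred = ["bug", "irrelevant", "irrelevant", "bug", "bug", "feature"]

for label in ("bug", "feature"):
    p, r = precision_recall(gold, pred, label)
    print(f"{label}: precision={p:.2f} recall={r:.2f}")
```

In this toy example, every bug report is found (Recall 1.0) at the cost of one false alarm, while one of the two feature requests is missed (Recall 0.5). A missed comment of this kind is the error that the abstract's emphasis on high Recall is meant to avoid.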