Classifying User Requirements from Online Feedback in Small Dataset Environments using Deep Learning

R. Mekala, Asif Irfan, Eduard C. Groen, Adam Porter, Mikael Lindvall
Published in: 2021 IEEE 29th International Requirements Engineering Conference (RE)
Publication date: 2021-09-01
DOI: 10.1109/RE51729.2021.00020 (https://doi.org/10.1109/RE51729.2021.00020)
Citations: 13

Abstract

An overwhelming number of users access app repositories like App Store/Google Play and social media platforms like Twitter, where they provide feedback on digital experiences. This vast textual corpus comprising user feedback has the potential to unearth detailed insights regarding the users’ opinions on products and services. Various tools have been proposed that employ natural language processing (NLP) and traditional machine learning (ML) based models as an inexpensive mechanism to identify requirements in user feedback. However, they fall short on their classification accuracy over unseen data due to factors like the cost of generating voluminous de-biased labeled datasets and general inefficiency. Recently, Van Vliet et al. [1] achieved state-of-the-art results extracting and classifying requirements from user reviews through traditional crowdsourcing. Based on their reference classification tasks and outcomes, we successfully developed and validated a deep-learning-backed artificial intelligence pipeline to achieve a state-of-the-art averaged classification accuracy of ∼87% on standard tasks for user feedback analysis. This approach, which comprises a BERT-based sequence classifier, proved effective even in extremely low-volume dataset environments. Additionally, our approach drastically reduces the time and costs of evaluation, and improves on the accuracy measures achieved using traditional ML-/NLP-based techniques.
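The abstract reports an "averaged classification accuracy" over several standard tasks without spelling out how the average is taken. The following is a minimal sketch of one plausible reading, an unweighted mean of per-task accuracies. The function names, task names, and labels are all hypothetical and made up for illustration; they are not taken from the paper, and the authors' actual evaluation procedure may differ.

```python
from statistics import mean


def accuracy(gold, predicted):
    """Fraction of items whose predicted label equals the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot score an empty task")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def averaged_accuracy(tasks):
    """Unweighted mean of per-task accuracies (each task counts equally)."""
    return mean(accuracy(gold, pred) for gold, pred in tasks.values())


# Toy, made-up labels for two hypothetical tasks; NOT data from the paper.
toy_tasks = {
    "is_requirement": (
        ["yes", "no", "yes", "no"],   # gold labels
        ["yes", "no", "no", "no"],    # classifier output (one mistake)
    ),
    "requirement_type": (
        ["feature", "quality", "feature", "quality"],
        ["feature", "quality", "feature", "quality"],
    ),
}

score = averaged_accuracy(toy_tasks)
print(f"averaged accuracy: {score:.3f}")
```

Averaging per task rather than per item keeps a task with many examples from dominating the headline figure, which matters when the labeled sets are small and unevenly sized.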