Application of back-translation: a transfer learning approach to identify ambiguous software requirements

Isha Subedi, Maninder Singh, Vijayalakshmi Ramasamy, G. Walia
Proceedings of the 2021 ACM Southeast Conference, published 2021-04-15
DOI: 10.1145/3409334.3452068 (https://doi.org/10.1145/3409334.3452068)
Citations: 6

Abstract

Ambiguous requirements are a problem in requirements engineering because stakeholders can disagree on how to interpret them, which leads to a variety of issues in the development stages. Since requirement specifications are usually written in natural language, analyzing them for ambiguity is currently a manual process; it has not been automated to a degree that meets industry standards. In this paper, we used transfer learning with ULMFiT: we pre-trained our model on a general-domain corpus and then fine-tuned it to classify requirements as ambiguous or unambiguous (the target task). We then compared its accuracy with that of machine learning classifiers such as SVM (Support Vector Machines), Logistic Regression, and Multinomial Naive Bayes. We also used back translation (BT) as a text augmentation technique to see whether it improved classification accuracy. Our results showed that ULMFiT achieved higher accuracy than SVM, Logistic Regression, and Multinomial Naive Bayes on our initial data set. After the requirements were augmented with BT, ULMFiT again achieved higher accuracy than the SVM, Logistic Regression, and Multinomial Naive Bayes classifiers, improving the initial performance by 5.371%. Our research provides some promising insights into how transfer learning and text augmentation can be applied to small data sets in requirements engineering.
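To make the augmentation step concrete, the following is a minimal sketch of how back translation can enlarge a small labeled data set. It is not the authors' pipeline: the abstract does not name the pivot language or the translation system, so the two translator functions here are toy, dictionary-based stand-ins, and the sample requirements and their labels are invented for illustration. All function and variable names are hypothetical.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (requirement text, label)
Translator = Callable[[str], str]


def back_translate(text: str, to_pivot: Translator, from_pivot: Translator) -> str:
    """Translate text into a pivot language and back into the source language."""
    return from_pivot(to_pivot(text))


def augment(data: List[Example], to_pivot: Translator, from_pivot: Translator) -> List[Example]:
    """Return the original examples plus back-translated paraphrases.

    A paraphrase inherits the label of its source requirement and is kept
    only if its text does not already appear in the data set.
    """
    seen = {text for text, _ in data}
    augmented = list(data)
    for text, label in data:
        paraphrase = back_translate(text, to_pivot, from_pivot)
        if paraphrase not in seen:
            seen.add(paraphrase)
            augmented.append((paraphrase, label))
    return augmented


# Toy stand-ins for a real translation system (assumption, illustration only).
_EN_TO_PIVOT = {"quickly": "rapide", "shall": "doit"}
_PIVOT_TO_EN = {"rapide": "fast", "doit": "must"}


def toy_to_pivot(text: str) -> str:
    """Word-by-word lookup; unknown words pass through unchanged."""
    return " ".join(_EN_TO_PIVOT.get(word, word) for word in text.split())


def toy_from_pivot(text: str) -> str:
    """Word-by-word lookup back into English; unknown words pass through."""
    return " ".join(_PIVOT_TO_EN.get(word, word) for word in text.split())


# Invented sample requirements with invented labels.
requirements = [
    ("The system shall respond quickly", "ambiguous"),
    ("The API returns HTTP 200 within 2 seconds", "unambiguous"),
]
augmented = augment(requirements, toy_to_pivot, toy_from_pivot)
```

In this sketch the first requirement comes back reworded and is added with its original label, while the second survives the round trip unchanged and is therefore not duplicated. With a real translation system, the two stand-in functions would be replaced by calls to that system, and the augmented list would feed the classifier's training set.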