Evaluation of Context-Aware Language Models and Experts for Effort Estimation of Software Maintenance Issues

2022 IEEE International Conference on Software Maintenance and Evolution (ICSME) Pub Date : 2022-10-01 DOI:10.1109/ICSME55016.2022.00020

Mohammed Alhamed, Tim Storer

{"title":"Evaluation of Context-Aware Language Models and Experts for Effort Estimation of Software Maintenance Issues","authors":"Mohammed Alhamed, Tim Storer","doi":"10.1109/ICSME55016.2022.00020","DOIUrl":null,"url":null,"abstract":"Reflecting upon recent advances in Natural Language Processing (NLP), this paper evaluates the effectiveness of context-aware NLP models for predicting software task effort estimates. Term Frequency–Inverse Document Frequency (TF-IDF) and Bidirectional Encoder Representations from Transformers (BERT) were used as feature extraction methods; Random forest and BERT feed-forward linear neural networks were used as classifiers. Using three datasets drawn from open-source projects and one from a commercial project, the paper evaluates the models and compares the best performing model with expert estimates from both kinds of datasets. The results suggest that BERT as feature extraction and classifier shows slightly better performance than other combinations, but that there is no significant difference between the presented methods. On the other hand, the results show that expert and Machine Learning (ML) estimate performances are similar, with the experts’ performance being slightly better. Both findings confirmed existing literature, but using substantially different experimental settings.","PeriodicalId":300084,"journal":{"name":"2022 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"103 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE International Conference on Software Maintenance and Evolution (ICSME)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSME55016.2022.00020","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 1

Abstract

Reflecting upon recent advances in Natural Language Processing (NLP), this paper evaluates the effectiveness of context-aware NLP models for predicting software task effort estimates. Term Frequency–Inverse Document Frequency (TF-IDF) and Bidirectional Encoder Representations from Transformers (BERT) were used as feature extraction methods; Random forest and BERT feed-forward linear neural networks were used as classifiers. Using three datasets drawn from open-source projects and one from a commercial project, the paper evaluates the models and compares the best performing model with expert estimates from both kinds of datasets. The results suggest that BERT as feature extraction and classifier shows slightly better performance than other combinations, but that there is no significant difference between the presented methods. On the other hand, the results show that expert and Machine Learning (ML) estimate performances are similar, with the experts’ performance being slightly better. Both findings confirmed existing literature, but using substantially different experimental settings.

查看原文本刊更多论文

上下文感知语言模型的评估和专家对软件维护问题的工作量评估

回顾自然语言处理(NLP)的最新进展，本文评估了上下文感知的NLP模型在预测软件任务工作量估计方面的有效性。使用词频-逆文档频率(TF-IDF)和双向编码器表示(BERT)作为特征提取方法;采用随机森林和BERT前馈线性神经网络作为分类器。本文使用来自开源项目的三个数据集和一个来自商业项目的数据集，对模型进行评估，并将表现最佳的模型与来自两种数据集的专家估计进行比较。结果表明，BERT作为特征提取和分类器的性能略好于其他组合，但两种方法之间没有显著差异。另一方面，结果表明专家和机器学习(ML)的估计性能相似，专家的性能略好。这两项发现都证实了现有文献，但采用了截然不同的实验环境。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

2022 IEEE International Conference on Software Maintenance and Evolution (ICSME)

自引率

0.00%

发文量