用seBERT预测问题类型

2022 IEEE/ACM 1st International Workshop on Natural Language-Based Software Engineering (NLBSE) Pub Date : 2022-05-01 DOI:10.1145/3528588.3528661

Alexander Trautsch, S. Herbold

{"title":"用seBERT预测问题类型","authors":"Alexander Trautsch, S. Herbold","doi":"10.1145/3528588.3528661","DOIUrl":null,"url":null,"abstract":"Pre-trained transformer models are the current state-of-the-art for natural language models processing. seBERT is such a model, that was developed based on the BERT architecture, but trained from scratch with software engineering data. We fine-tuned this model for the NLBSE challenge for the task of issue type prediction. Our model dominates the baseline fastText for all three issue types in both recall and precision to achieve an overall F1-score of 85.7%, which is an increase of 4.1% over the baseline.","PeriodicalId":313397,"journal":{"name":"2022 IEEE/ACM 1st International Workshop on Natural Language-Based Software Engineering (NLBSE)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":"{\"title\":\"Predicting Issue Types with seBERT\",\"authors\":\"Alexander Trautsch, S. Herbold\",\"doi\":\"10.1145/3528588.3528661\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Pre-trained transformer models are the current state-of-the-art for natural language models processing. seBERT is such a model, that was developed based on the BERT architecture, but trained from scratch with software engineering data. We fine-tuned this model for the NLBSE challenge for the task of issue type prediction. Our model dominates the baseline fastText for all three issue types in both recall and precision to achieve an overall F1-score of 85.7%, which is an increase of 4.1% over the baseline.\",\"PeriodicalId\":313397,\"journal\":{\"name\":\"2022 IEEE/ACM 1st International Workshop on Natural Language-Based Software Engineering (NLBSE)\",\"volume\":\"4 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-05-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"8\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE/ACM 1st International Workshop on Natural Language-Based Software Engineering (NLBSE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3528588.3528661\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE/ACM 1st International Workshop on Natural Language-Based Software Engineering (NLBSE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3528588.3528661","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 8

摘要

预训练的变压器模型是目前自然语言模型处理的最先进技术。seBERT就是这样一个模型，它是基于BERT架构开发的，但是用软件工程数据从头开始训练。我们对这个模型进行了微调，以适应NLBSE挑战的问题类型预测任务。我们的模型在所有三种问题类型的召回率和准确率方面都优于基线fastText，达到了85.7%的f1总分，比基线提高了4.1%。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Predicting Issue Types with seBERT

Pre-trained transformer models are the current state-of-the-art for natural language models processing. seBERT is such a model, that was developed based on the BERT architecture, but trained from scratch with software engineering data. We fine-tuned this model for the NLBSE challenge for the task of issue type prediction. Our model dominates the baseline fastText for all three issue types in both recall and precision to achieve an overall F1-score of 85.7%, which is an increase of 4.1% over the baseline.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2022 IEEE/ACM 1st International Workshop on Natural Language-Based Software Engineering (NLBSE)

自引率

0.00%

发文量