2022 IEEE/ACM 1st International Workshop on Natural Language-Based Software Engineering (NLBSE): Latest Papers, Page 2

GitHub Issue Classification Using BERT-Style Models

2022 IEEE/ACM 1st International Workshop on Natural Language-Based Software Engineering (NLBSE) Pub Date: 2022-05-01 DOI: 10.1145/3528588.3528663

Shikhar Bharadwaj, Tushar Kadam

Citations: 9

On the Evaluation of NLP-based Models for Software Engineering

2022 IEEE/ACM 1st International Workshop on Natural Language-Based Software Engineering (NLBSE) Pub Date: 2022-03-31 DOI: 10.1145/3528588.3528665

M. Izadi, Matin Nili Ahmadabadi

Citations: 6

CatIss: An Intelligent Tool for Categorizing Issues Reports using Transformers

2022 IEEE/ACM 1st International Workshop on Natural Language-Based Software Engineering (NLBSE) Pub Date: 2022-03-31 DOI: 10.1145/3528588.3528662

M. Izadi

Abstract: Users rely on issue tracking systems to track and manage issue reports in their repositories. An issue is a rich source of software information and may report a problem, request a new feature, or simply ask a question about the software product. As the number of issues grows, managing them manually becomes harder, so automatic approaches have been proposed to help manage issue reports. This paper describes CatIss, an automatic categorizer of issue reports built on the Transformer-based pre-trained RoBERTa model. CatIss classifies issue reports into three main categories: bug report, enhancement/feature request, and question. First, the datasets provided for the NLBSE tool competition are cleaned and preprocessed. Then, the pre-trained RoBERTa model is fine-tuned on the preprocessed dataset. Evaluated on about 80 thousand issue reports from GitHub, CatIss surpasses the competition baseline, TicketTagger, and achieves an 87.2% F1-score (micro average). Because CatIss is trained on a wide set of repositories, it is a generic prediction model and is therefore applicable to unseen software projects or projects with little historical data. Scripts for cleaning the datasets, training CatIss, and evaluating the model are publicly available.

Citations: 11

Can NMT Understand Me? Towards Perturbation-based Evaluation of NMT Models for Code Generation

2022 IEEE/ACM 1st International Workshop on Natural Language-Based Software Engineering (NLBSE) Pub Date: 2022-03-29 DOI: 10.1145/3528588.3528653

Pietro Liguori, Cristina Improta, S. D. Vivo, R. Natella, B. Cukic, Domenico Cotroneo

Citations: 3

Understanding Digits in Identifier Names: An Exploratory Study

2022 IEEE/ACM 1st International Workshop on Natural Language-Based Software Engineering (NLBSE) Pub Date: 2022-02-28 DOI: 10.1145/3528588.3528657

Anthony Peruma, Christian D. Newman

Citations: 2