Multi-label Classification of Commit Messages using Transfer Learning

2020 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW). Pub Date: 2020-10-01. DOI: 10.1109/ISSREW51248.2020.00034

Muhammad Usman Sarwar, Sarim Zafar, Mohamed Wiem Mkaouer, G. Walia, Muhammad Zubair Malik

Citations: 7

Abstract

Developers in industry use commit messages to annotate changes made to the code. Accurate classification of these messages can help monitor the software evolution process and enable better tracking for various industrial stakeholders. In this paper, we present a state-of-the-art method for classifying commit messages into categories based on Swanson’s maintenance activities, i.e., “Corrective”, “Perfective”, and “Adaptive”. This is a challenging task because not all commit messages are well written and informative. Existing approaches rely on keyword-based techniques to solve this problem. However, these approaches do not use a full language model and do not recognize the contextual relationships between words. The state-of-the-art methodology in Natural Language Processing (NLP) is to train a context-aware neural network (Transformer) on a very large data set that encompasses the entire language and then fine-tune it for a specific task. In this way, the model can learn the language, pay attention to the context, and then transfer that knowledge for better performance on the specific task. We use an off-the-shelf neural network called DistilBERT and fine-tune it for the commit message classification task. This step is non-trivial because programming languages and commit messages have unique keywords, jargon, and idioms. This paper presents our effort in training this model and constructing the data set for this task, including the rules used to construct the data set. We validate our approach on industrial projects from GitHub, such as Kubernetes, Linux, TensorFlow, Spark, TypeScript, and PyTorch. We achieve an F1-score of 87% on the commit message classification task, which is an order of magnitude more accurate than previous studies.
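
To make the contrast with keyword-based techniques concrete, the following is a minimal Python sketch of such a baseline together with a multi-label F1 computation. It is not the authors' method or data: the keyword lists, the example messages, and the function names `classify` and `micro_f1` are hypothetical, and the abstract does not state which F1 averaging the paper uses, so micro-averaging is an assumption here.

```python
from typing import Dict, List, Set

# Hypothetical keyword lists, chosen for illustration only; the rules the
# paper uses to construct its data set are not given in the abstract.
KEYWORDS: Dict[str, Set[str]] = {
    "Corrective": {"fix", "bug", "error", "crash"},
    "Perfective": {"refactor", "cleanup", "optimize", "improve"},
    "Adaptive": {"add", "support", "upgrade", "port"},
}


def classify(message: str) -> Set[str]:
    """Return every label whose keyword list shares a token with the message."""
    tokens = {tok.strip(".,:;!?()").lower() for tok in message.split()}
    return {label for label, words in KEYWORDS.items() if tokens & words}


def micro_f1(predicted: List[Set[str]], expected: List[Set[str]]) -> float:
    """Micro-averaged F1 over all (message, label) decisions (an assumed metric)."""
    tp = sum(len(p & e) for p, e in zip(predicted, expected))
    fp = sum(len(p - e) for p, e in zip(predicted, expected))
    fn = sum(len(e - p) for p, e in zip(predicted, expected))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical messages with hand-assigned labels, for illustration only.
    messages = [
        "Fix crash and add support for ARM",
        "Remove the bug report template",
    ]
    expected = [{"Corrective", "Adaptive"}, {"Perfective"}]
    predicted = [classify(m) for m in messages]
    print(predicted)
    print(micro_f1(predicted, expected))
```

The second message shows the weakness the abstract describes: the token "bug" triggers the "Corrective" label even though the commit does not fix a defect, because the matcher ignores the surrounding words. A fine-tuned context-aware model is intended to avoid this kind of error.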

