{"title":"Automated detection of class diagram smells using self-supervised learning","authors":"Amal Alazba, Hamoud Aljamaan, Mohammad Alshayeb","doi":"10.1007/s10515-024-00429-w","DOIUrl":null,"url":null,"abstract":"<div><p>Design smells are symptoms of poorly designed solutions that may result in several maintenance issues. While various approaches, including traditional machine learning methods, have been proposed and shown to be effective in detecting design smells, they require extensive manually labeled data, which is expensive and challenging to scale. To leverage the vast amount of data that is now accessible, unsupervised semantic feature learning, or learning without requiring manual annotation labor, is essential. The goal of this paper is to propose a design smell detection method that is based on self-supervised learning. We propose Model Representation with Transformers (MoRT) to learn the UML class diagram features by training Transformers to recognize masked keywords. We empirically show how effective the defined proxy task is at learning semantic and structural properties. We thoroughly assess MoRT using four model smells: the Blob, Functional Decomposition, Spaghetti Code, and Swiss Army Knife. Furthermore, we compare our findings with supervised learning and feature-based methods. Finally, we ran a cross-project experiment to assess the generalizability of our approach. Results show that MoRT is highly effective in detecting design smells.</p></div>","PeriodicalId":55414,"journal":{"name":"Automated Software Engineering","volume":"31 1","pages":""},"PeriodicalIF":2.0000,"publicationDate":"2024-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Automated Software Engineering","FirstCategoryId":"94","ListUrlMain":"https://link.springer.com/article/10.1007/s10515-024-00429-w","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
引用次数: 0
Abstract
Design smells are symptoms of poorly designed solutions that may result in several maintenance issues. While various approaches, including traditional machine learning methods, have been proposed and shown to be effective in detecting design smells, they require extensive manually labeled data, which is expensive and challenging to scale. To leverage the vast amount of data that is now accessible, unsupervised semantic feature learning, or learning without requiring manual annotation labor, is essential. The goal of this paper is to propose a design smell detection method that is based on self-supervised learning. We propose Model Representation with Transformers (MoRT) to learn the UML class diagram features by training Transformers to recognize masked keywords. We empirically show how effective the defined proxy task is at learning semantic and structural properties. We thoroughly assess MoRT using four model smells: the Blob, Functional Decomposition, Spaghetti Code, and Swiss Army Knife. Furthermore, we compare our findings with supervised learning and feature-based methods. Finally, we ran a cross-project experiment to assess the generalizability of our approach. Results show that MoRT is highly effective in detecting design smells.
期刊介绍:
This journal details research, tutorial papers, survey and accounts of significant industrial experience in the foundations, techniques, tools and applications of automated software engineering technology. This includes the study of techniques for constructing, understanding, adapting, and modeling software artifacts and processes.
Coverage in Automated Software Engineering examines both automatic systems and collaborative systems as well as computational models of human software engineering activities. In addition, it presents knowledge representations and artificial intelligence techniques applicable to automated software engineering, and formal techniques that support or provide theoretical foundations. The journal also includes reviews of books, software, conferences and workshops.