Machine learning with word embedding for detecting web-services anti-patterns
Lov Kumar, Sahithi Tummalapalli, Sonika Chandrakant Rathi, Lalita Bhanu Murthy, Aneesh Krishna, Sanjay Misra
Journal of Computer Languages, volume 75, Article 101207 (2023-06-01)
DOI: 10.1016/j.cola.2023.101207
https://www.sciencedirect.com/science/article/pii/S2590118423000175
Citations: 0
Abstract
A software design anti-pattern is a commonly used response to a recurring problem that is ineffective and carries a high risk of failure. Predicting these anti-patterns early helps reduce the effort, resources, and cost of the design process. Earlier research used static code metrics or Web Service Description Language (WSDL) metrics to build anti-pattern prediction models. These source code metrics are computed at either the file level or the system level, so their values often depend on assumptions that are neither defined nor standardized and may vary with the tools available. This study aims to develop machine learning-based anti-pattern prediction models that use natural language processing techniques to represent the WSDL file as input. Four word embedding methods are used to process the WSDL file, and the processed outputs are fed to models trained with thirty-three classification techniques. The study also applies eight feature selection techniques to remove ineffective features and five data sampling techniques to handle class imbalance in the datasets. The results indicate that models built on text metrics perform better than those built on static code or WSDL metrics. They also suggest that feature selection and data balancing through sampling help improve model performance.
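The abstract does not name the four embedding methods, the classifiers, or the sampling techniques, so the following is only a minimal sketch of the general idea of turning a WSDL file into text features and balancing the classes. It assumes averaged word vectors as the embedding and random oversampling as the balancing step; both are stand-ins, not the authors' method. The names `wsdl_tokens`, `mean_embedding`, `random_oversample`, and `TOY_VECTORS`, along with the sample WSDL, are hypothetical.

```python
import random
import re
import xml.etree.ElementTree as ET


def wsdl_tokens(wsdl_text):
    """Collect lower-cased word tokens from the name attributes of a WSDL document."""
    root = ET.fromstring(wsdl_text)
    tokens = []
    for element in root.iter():
        name = element.attrib.get("name")
        if name:
            # Split identifiers such as "getOrderStatus" into get / Order / Status.
            parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", name)
            tokens += [part.lower() for part in parts]
    return tokens


def mean_embedding(tokens, vectors, dim):
    """Average the vectors of known tokens; unknown tokens are skipped."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def random_oversample(features, labels, seed=0):
    """Duplicate rows at random until every class is as large as the largest one."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(features, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(rows) for rows in by_class.values())
    out_x, out_y = [], []
    for y, rows in by_class.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        out_x += rows + extra
        out_y += [y] * target
    return out_x, out_y


# Hypothetical WSDL fragment and toy 2-dimensional vectors, for illustration only.
WSDL = """<definitions xmlns="http://schemas.xmlsoap.org/wsdl/" name="OrderService">
  <portType name="OrderPort">
    <operation name="getOrderStatus"/>
  </portType>
</definitions>"""

TOY_VECTORS = {"order": [1.0, 0.0], "status": [0.0, 1.0], "get": [1.0, 1.0]}

tokens = wsdl_tokens(WSDL)
vector = mean_embedding(tokens, TOY_VECTORS, dim=2)
```

In a fuller pipeline, one such vector per WSDL file would pass through feature selection and a balancing step like `random_oversample` before being handed to a classifier; a trained embedding model would take the place of the toy lookup table.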