Modified MMS: Minimization Approach for Model Subset Selection
Authors: C. Rajathi, P. Rukmani
Journal: Computers, Materials & Continua
Published: 2023-01-01
DOI: https://doi.org/10.32604/cmc.2023.041507
Citation count: 0
Abstract
Given recent developments in the digital environment, ensuring a higher level of security for networking systems is imperative, and security approaches are constantly being developed to protect against evolving threats. Many prior studies have reported promising results from ensemble models for intrusion classification. This research aimed to create a more diverse and effective ensemble model. To this end, six classification models from existing studies, Logistic Regression (LR), Naive Bayes (NB), K-Nearest Neighbor (KNN), Decision Tree (DT), Support Vector Machine (SVM), and Random Forest (RF), were selected to run as independent models. Once the individual models were trained, a Correlation-Based Diversity Matrix (CDM) was created by determining their closeness. The models for the ensemble were then chosen by the proposed Modified Minimization Approach for Model Subset Selection (Modified-MMS), which takes the lower triangular CDM (L-CDM) as input. The performance of the proposed algorithm was assessed on the Network Security Laboratory-Knowledge Discovery in Databases (NSL-KDD) dataset using several metrics, including accuracy, precision, recall, and F1-score. By selecting a diverse set of models, the proposed system enhances ensemble performance by reducing overfitting and increasing prediction accuracy. The proposed work achieved an accuracy of 99.26% using only two classification models in the ensemble, surpassing the performance of a larger ensemble that employs all six classification models.
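
The abstract does not spell out how the CDM is computed or how Modified-MMS searches it, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that "closeness" means the Pearson correlation between the models' prediction vectors and that the selection step simply picks the least-correlated pair from the strict lower triangle. The function names and the toy predictions are hypothetical.

```python
import numpy as np

def correlation_diversity_matrix(predictions):
    """Pairwise Pearson correlation between models' prediction vectors.

    Assumption: "closeness" is measured as correlation of predictions.
    """
    names = list(predictions)
    stacked = np.array([predictions[n] for n in names], dtype=float)
    return names, np.corrcoef(stacked)

def least_correlated_pair(names, cdm):
    """Scan the strict lower triangle and return the least-correlated pair.

    Assumption: a stand-in for the selection step, not Modified-MMS itself.
    """
    rows, cols = np.tril_indices(len(names), k=-1)
    best = int(np.argmin(cdm[rows, cols]))
    r, c = rows[best], cols[best]
    return names[r], names[c], float(cdm[r, c])

# Toy binary predictions, made up purely for illustration.
toy_predictions = {
    "LR": [1, 0, 1, 1, 0, 0, 1, 0],
    "NB": [1, 0, 1, 1, 0, 0, 1, 1],
    "DT": [0, 1, 1, 0, 0, 1, 1, 0],
}

names, cdm = correlation_diversity_matrix(toy_predictions)
pair = least_correlated_pair(names, cdm)
print(pair)
```

On this toy input, LR and NB agree on most samples and are strongly correlated, while NB and DT are negatively correlated, so the sketch selects that pair. A real pipeline would compute the matrix from held-out predictions of the six trained classifiers and then evaluate the chosen subset as an ensemble.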