Modified MMS: Minimization Approach for Model Subset Selection

Impact factor: 1.7

Computers, Materials & Continua | Pub Date: 2023-01-01 | DOI: 10.32604/cmc.2023.041507

C. Rajathi, P. Rukmani



Abstract

Given recent developments in the digital environment, ensuring a higher level of security for networking systems is imperative, and many security approaches are constantly being developed to protect against evolving threats. Many prior studies have shown that ensemble models yield promising results for intrusion classification. This work aimed to create a more diverse and effective ensemble model. To this end, six classification models, Logistic Regression (LR), Naive Bayes (NB), K-Nearest Neighbor (KNN), Decision Tree (DT), Support Vector Machine (SVM), and Random Forest (RF), were selected from existing studies and run as independent models. Once the individual models were trained, a Correlation-Based Diversity Matrix (CDM) was created by determining their closeness. The proposed Modified Minimization Approach for Model Subset Selection (Modified-MMS) then chose the models for the ensemble, taking the lower-triangular CDM (L-CDM) as input. The performance of the proposed algorithm was assessed on the Network Security Laboratory-Knowledge Discovery in Databases (NSL-KDD) dataset using several metrics, including accuracy, precision, recall, and F1-score. By selecting a diverse set of models, the proposed system enhances ensemble performance by reducing overfitting and increasing prediction accuracy. The proposed work achieved an accuracy of 99.26% using only two classification models in the ensemble, surpassing the performance of a larger ensemble that employs all six classification models.
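
The abstract does not give the details of Modified-MMS, so the following is only a minimal sketch of the general idea it describes: compute pairwise correlations between the models' predictions, keep the lower triangle of that matrix, and pick the least-correlated models. The Pearson correlation, the pair-wise selection rule, the function names, and the toy predictions are all assumptions made for illustration and are not taken from the paper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: constant predictions count as uncorrelated
    return cov / sqrt(var_x * var_y)


def lower_triangular_cdm(predictions):
    """Map each model pair (row, column) below the diagonal to its correlation."""
    names = list(predictions)
    return {
        (names[i], names[j]): pearson(predictions[names[i]], predictions[names[j]])
        for i in range(len(names))
        for j in range(i)  # j < i keeps only the lower triangle
    }


def least_correlated_pair(l_cdm):
    """Return the model pair with the smallest correlation (assumed most diverse)."""
    return min(l_cdm, key=l_cdm.get)


# Hypothetical 0/1 predictions on eight samples (made-up data, not NSL-KDD).
toy_predictions = {
    "LR":  [1, 0, 1, 1, 0, 0, 1, 0],
    "NB":  [1, 0, 1, 1, 0, 1, 1, 0],
    "KNN": [1, 1, 0, 1, 0, 0, 1, 1],
    "DT":  [1, 0, 1, 0, 0, 0, 1, 0],
}

l_cdm = lower_triangular_cdm(toy_predictions)
selected = least_correlated_pair(l_cdm)
print(selected)  # ('KNN', 'NB')
```

On this toy data the NB and KNN predictions have the lowest correlation, so that pair would form the two-model ensemble. Whether the paper minimizes the raw or the absolute correlation, and how it handles subsets larger than two, is not stated in the abstract.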

