Classification Tendency Difference Index Model for Feature Selection and Extraction in Wireless Intrusion Detection

Future Internet. Pub Date: 2024-01-12. DOI: 10.3390/fi16010025

C. Tseng, Woei-Jiunn Tsaur, Yueh-Mao Shen


Abstract

Deep neural networks (DNNs) are effective at detecting large-scale attacks, provided they are trained on high-quality data samples. Feature selection and feature extraction are the primary ways to improve data quality for high-accuracy intrusion detection. However, the reasons these methods improve detection are usually only weakly related to the differences between normal and attack behaviors in the data samples. We therefore propose a Classification Tendency Difference Index (CTDI) model for feature selection and extraction in intrusion detection. The CTDI model consists of three indexes: Classification Tendency Frequency Difference (CTFD), Classification Tendency Membership Difference (CTMD), and Classification Tendency Distance Difference (CTDD). In the dataset, each feature takes many feature values (FVs). For each FV, the normal and attack samples indicate the FV's classification tendency, and CTDI measures the difference in classification tendency between the normal and attack samples. CTFD is the frequency difference between the normal and attack samples. With fuzzy C-means (FCM) used to establish the normal and attack clusters, CTMD is the membership difference between the clusters, and CTDD is the distance difference between the cluster centers. CTDI calculates an index score for each FV and combines the scores of all FVs of a feature into the feature score, separately for each of the three indexes. CTDI uses an autoencoder for feature extraction, generating new features from the dataset and calculating the three index scores for them. CTDI then sorts the original and new features by each of the three indexes to select the best features. The selected CTDI features show the best classification tendency differences between normal and attack samples. Experimental results on the Aegean WiFi Intrusion Dataset show that the CTDI features, when classified by a DNN, achieve better detection accuracy than related works, and that the improvement stems from the improved classification tendency differences in the CTDI features.
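
To make the frequency-based index more concrete, the following is a minimal sketch of a CTFD-style score, not the authors' formula. The abstract does not give the exact definition, so this sketch assumes that the score of one FV is the absolute difference between the fraction of normal samples and the fraction of attack samples carrying that FV, and that the feature score is the sum over all FVs. The function names ctfd_feature_score and rank_features, and the label encoding (0 = normal, 1 = attack), are hypothetical.

```python
from collections import Counter


def ctfd_feature_score(values, labels):
    """CTFD-style score for one feature (illustrative assumption only).

    values: the feature value (FV) of each sample
    labels: 0 for a normal sample, 1 for an attack sample (assumed encoding)
    """
    # Count how often each FV occurs among normal and among attack samples.
    normal = Counter(v for v, y in zip(values, labels) if y == 0)
    attack = Counter(v for v, y in zip(values, labels) if y == 1)
    n_normal = sum(normal.values())
    n_attack = sum(attack.values())
    if n_normal == 0 or n_attack == 0:
        raise ValueError("need at least one normal and one attack sample")

    # Per-FV score: absolute difference of the two relative frequencies.
    # Feature score: sum of the per-FV scores (assumed aggregation).
    score = 0.0
    for fv in set(normal) | set(attack):
        score += abs(normal[fv] / n_normal - attack[fv] / n_attack)
    return score


def rank_features(columns, labels):
    """Sort features by descending score; columns maps name -> list of FVs."""
    scored = [(name, ctfd_feature_score(vals, labels))
              for name, vals in columns.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    columns = {
        "separating": ["a", "a", "b", "b"],  # FVs split cleanly by class
        "constant": ["a", "a", "a", "a"],    # same FV in both classes
    }
    for name, score in rank_features(columns, labels):
        print(name, score)
```

Under these assumptions, a feature whose values never overlap between normal and attack samples receives the maximum score of 2, and a feature with identical value distributions in both classes receives 0, so sorting by the score puts the more class-separating features first. The membership and distance indexes (CTMD, CTDD) would additionally require an FCM clustering step, which this sketch leaves out.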
