Causal Genetic Network Anomaly Detection Method for Imbalanced Data and Information Redundancy
Authors: Zengri Zeng; Xuhui Liu; Ming Dai; Jian Zheng; Xiaoheng Deng; Detian Zeng; Jie Chen
Journal: IEEE Transactions on Network and Service Management, vol. 21, no. 6, pp. 6937-6952
Publication date: 2024-09-06
Publication type: Journal Article
DOI: 10.1109/TNSM.2024.3455768
URL: https://ieeexplore.ieee.org/document/10668849/
Impact factor: 4.7; JCR: Q1 (Computer Science, Information Systems)
The proliferation of Internet-connected devices and the complexity of modern network environments have led to the collection of massive and high-dimensional datasets, resulting in substantial information redundancy and sample imbalance issues. These challenges not only hinder the computational efficiency and generalizability of anomaly detection systems but also compromise their ability to detect rare attack types, posing significant security threats. To address these pressing issues, we propose a novel causal genetic network-based anomaly detection method, the CNSGA, which integrates causal inference and the nondominated sorting genetic algorithm-III (NSGA-III). The CNSGA leverages causal reasoning to exclude irrelevant information, focusing solely on the features that are causally related to the outcome labels. Simultaneously, NSGA-III iteratively eliminates redundant information and prioritizes minority samples, thereby enhancing detection performance. To quantitatively assess the improvements achieved, we introduce two indices: a detection balance index and an optimal feature subset index. These indices, along with the causal effect weights, serve as fitness metrics for iterative optimization. The optimized individuals are then selected for subsequent population generation on the basis of nondominated reference point ordering. The experimental results obtained with four real-world network attack datasets demonstrate that the CNSGA significantly outperforms existing methods in terms of overall precision, the imbalance index, and the optimal feature subset index, with maximum increases exceeding 10%, 0.5, and 50%, respectively. Notably, for the CICDDoS2019 dataset, the CNSGA requires only 16-dimensional features to effectively detect more than 70% of all sample types, including 6 more network attack sample types than the other methods detect. 
The significance and impact of this work encompass the ability to eliminate redundant information, increase detection rates, balance attack detection systems, and ensure stability and generalizability. The proposed CNSGA framework represents a significant step forward in developing efficient and accurate anomaly detection systems capable of defending against a wide range of cyber threats in complex network environments.
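The abstract describes selecting individuals by nondominated ordering over several fitness metrics. The following is a minimal sketch of that general idea only, not the authors' implementation: it ranks candidate feature subsets into Pareto fronts under the assumption that every objective is maximized. The three objective columns (a balance score, a compactness score, a causal-weight score) and all numbers in `toy_scores` are hypothetical placeholders, since the abstract does not define the indices, and the reference-point niching step that distinguishes NSGA-III is not shown.

```python
from typing import List, Sequence, Tuple

# One tuple of objective values per candidate feature subset.
# Assumption for this sketch: every objective is to be maximized.
Objectives = Tuple[float, ...]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is at least as good as b on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def nondominated_fronts(scores: Sequence[Objectives]) -> List[List[int]]:
    """Split candidate indices into successive Pareto fronts (front 0 is the best)."""
    remaining = set(range(len(scores)))
    fronts: List[List[int]] = []
    while remaining:
        # A candidate belongs to the current front if nothing still remaining dominates it.
        front = sorted(
            i for i in remaining
            if not any(dominates(scores[j], scores[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


# Hypothetical (balance, compactness, causal weight) scores; made-up toy values,
# not results from the paper.
toy_scores = [
    (0.9, 0.5, 0.7),
    (0.6, 0.8, 0.6),
    (0.5, 0.4, 0.5),
    (0.9, 0.5, 0.6),
]

fronts = nondominated_fronts(toy_scores)
print(fronts)
```

In a genetic loop, candidates from the earliest fronts would be carried into the next population first. This quadratic-time ranking is chosen for readability; faster nondominated sorting procedures exist.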
About the journal:
IEEE Transactions on Network and Service Management will publish (online only) peer-reviewed, archival-quality papers that advance the state of the art and practical applications of network and service management. Theoretical research contributions (presenting new concepts and techniques) and applied contributions (reporting on experiences and experiments with actual systems) will be encouraged. These transactions will focus on the key technical issues related to: Management Models, Architectures and Frameworks; Service Provisioning, Reliability and Quality Assurance; Management Functions; Enabling Technologies; Information and Communication Models; Policies; Applications and Case Studies; Emerging Technologies and Standards.