Semi-supervised intrusion detection system for in-vehicle networks based on variational autoencoder and adversarial reinforcement learning

IF 7.2 | Zone 1, Computer Science | JCR Q1, Computer Science, Artificial Intelligence

Knowledge-Based Systems Pub Date : 2024-09-28 DOI:10.1016/j.knosys.2024.112563

Full text: https://www.sciencedirect.com/science/article/pii/S0950705124011973

Citations: 0

Abstract

Despite the affordability, simplicity, and efficiency of the controller area network (CAN) protocol, its security vulnerabilities remain a major challenge. Machine learning-based intrusion detection systems (IDSs) are currently considered an effective way to improve CAN security by identifying malicious attacks. However, earlier studies that relied on supervised learning required considerable amounts of labeled data, and collecting data from vehicles is time-consuming and expensive. Furthermore, the collected data exhibit class imbalance, which further complicates analysis and model training. We therefore propose a semi-supervised learning-based IDS that combines a variational autoencoder (VAE) with adversarial reinforcement learning for the multi-class classification of both known and unknown attacks. The proposed system capitalizes on the diverse patterns inherent in unlabeled data, transforming the data space into one that is more conducive to classification. Concurrently, adversarial agents in the reinforcement learning algorithm interact competitively, progressively improving their ability to classify and select samples. To reduce the reliance on labeled data and to exploit the available labels effectively, we use a pseudo-labeling process for pre-training. Experimental results indicate that, for known attacks, the proposed model classifies more effectively than the baseline models while requiring less labeled data. Owing to the advantages inherited from the VAE, the proposed system also detects unknown attacks with similar or completely different characteristics, with F1 scores exceeding 0.9 and 0.84, respectively. Finally, the proposed system is shown to be a lightweight model that quickly detects malicious messages injected into in-vehicle networks, keeping latency minimal.
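The abstract mentions a pseudo-labeling process for pre-training but gives no details of it. One common form of pseudo-labeling, shown here only as a minimal sketch and not as the authors' procedure, keeps an unlabeled sample when the current classifier's top class probability clears a confidence threshold. The function name `select_pseudo_labels`, the threshold of 0.95, and the probability values are all hypothetical.

```python
def select_pseudo_labels(probs, threshold=0.95):
    """Return (sample_index, predicted_class) pairs for confident predictions.

    `probs` is a list of per-class probability rows, one row per unlabeled
    sample. The threshold is an illustrative assumption, not a value taken
    from the paper.
    """
    selected = []
    for i, row in enumerate(probs):
        # Index of the most probable class for this sample.
        label = max(range(len(row)), key=row.__getitem__)
        # Keep the sample only if the classifier is confident enough.
        if row[label] >= threshold:
            selected.append((i, label))
    return selected


# Hypothetical class probabilities for four unlabeled CAN frames.
probs = [
    [0.98, 0.01, 0.01],
    [0.40, 0.35, 0.25],
    [0.02, 0.97, 0.01],
    [0.10, 0.10, 0.80],
]
pseudo_labeled = select_pseudo_labels(probs, threshold=0.95)
```

In a scheme like this, the selected pairs would be added to the labeled pool for the next round of training, while low-confidence samples stay unlabeled.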

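Source journal

Knowledge-Based Systems (Engineering & Technology, Computer Science: Artificial Intelligence)

CiteScore: 14.80

Self-citation rate: 12.50%

Articles published: 1245

Review time: 7.8 months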

About the journal: Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on systems based on knowledge-based and other artificial intelligence techniques. The journal aims to support human prediction and decision-making through data science and computation techniques, provide balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.