Sovereignty-Aware Intrusion Detection on Streaming Data: Automatic Machine Learning Pipeline and Semantic Reasoning
Authors: Ayan Chatterjee, Sundar Gopalakrishnan, Ayan Mondal
Journal: Procedia Computer Science, Volume 254, Pages 78-87
Publication date: 2025-01-01
DOI: 10.1016/j.procs.2025.02.066
URL: https://www.sciencedirect.com/science/article/pii/S1877050925004168
Intrusion Detection Systems (IDS) are critical for safeguarding network infrastructure against malicious attacks. Traditional IDS often struggle with knowledge representation, real-time detection, and accuracy, especially on high-throughput data. This paper proposes an IDS framework that combines machine learning models, streaming data, and semantic knowledge representation to improve intrusion detection accuracy and scalability. The study also incorporates the concept of Digital Sovereignty, so that data control, security, and privacy are maintained in line with national and regional regulations. The proposed system integrates Apache Kafka for real-time data processing, an automated machine learning pipeline (e.g., the Tree-based Pipeline Optimization Tool, TPOT) for classifying network traffic, and OWL-based semantic reasoning for advanced threat detection. Evaluated on the NSL-KDD and CIC-IDS-2017 datasets, the system demonstrated qualitative outcomes such as local compliance, reduced data storage needs due to real-time processing, and improved adaptability to local data laws. The experimental results also reveal significant improvements in detection accuracy, processing efficiency, and sovereignty alignment.
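
The three-stage flow described in the abstract (streamed records, a learned classifier, then a reasoning layer, all under a data-locality constraint) can be illustrated with a minimal, standard-library-only Python sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the `FlowRecord` fields, the threshold in `classify` (a stand-in for a trained model such as one exported by an AutoML tool), the `RULES` list (a stand-in for OWL-based reasoning), the region-based policy in `detect`, and the in-memory list that takes the place of a Kafka topic.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List, Set, Tuple


@dataclass(frozen=True)
class FlowRecord:
    """One network-flow record as it might arrive on a stream (hypothetical fields)."""
    src_region: str      # jurisdiction where the record originated
    duration: float      # connection duration in seconds
    failed_logins: int   # failed login attempts seen in the connection


@dataclass(frozen=True)
class Alert:
    record: FlowRecord
    label: str           # classifier output: "normal" or "attack"
    threat_class: str    # category assigned by the rule layer


def classify(record: FlowRecord) -> str:
    """Placeholder for a trained model; the threshold is invented for the example."""
    return "attack" if record.failed_logins >= 3 else "normal"


# Stand-in for ontology-based reasoning: ordered (condition, threat class) rules.
# The first matching rule wins, so more specific rules come first.
RULES: List[Tuple[Callable[[FlowRecord], bool], str]] = [
    (lambda r: r.failed_logins >= 3 and r.duration < 1.0, "BruteForce"),
    (lambda r: r.failed_logins >= 3, "SuspiciousLogin"),
]


def infer_threat_class(record: FlowRecord) -> str:
    """Return the threat class of the first rule that matches the record."""
    for condition, threat_class in RULES:
        if condition(record):
            return threat_class
    return "Unclassified"


def detect(stream: Iterable[FlowRecord], allowed_regions: Set[str]) -> Iterator[Alert]:
    """Process records one at a time and yield alerts; raw traffic is never stored."""
    for record in stream:
        # Hypothetical sovereignty policy: skip data from other jurisdictions.
        if record.src_region not in allowed_regions:
            continue
        label = classify(record)
        if label == "attack":
            yield Alert(record, label, infer_threat_class(record))


# Usage: an in-memory list stands in for the message stream.
stream = [
    FlowRecord("EU", 0.4, 5),
    FlowRecord("EU", 12.0, 0),
    FlowRecord("US", 0.2, 9),
    FlowRecord("EU", 30.0, 4),
]
alerts = list(detect(stream, allowed_regions={"EU"}))
for alert in alerts:
    print(alert.record.src_region, alert.label, alert.threat_class)
```

Because `detect` is a generator, each record is handled as it arrives and only alerts are retained, which is one simple way to picture the reduced storage need that the abstract attributes to real-time processing. In a real deployment the list would be replaced by a stream consumer and the rule list by an ontology and reasoner; neither is shown here.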