An unsupervised machine learning approach for cyber threat detection using geographic profiling and Domain Name System data
Authors: Seyed-Ali Sadegh-Zadeh, Mostafa Tajdini
Journal: Decision Analytics Journal, volume 15, Article 100576
Published: 2025-04-16
DOI: 10.1016/j.dajour.2025.100576
URL: https://www.sciencedirect.com/science/article/pii/S2772662225000323
Cyber threat detection is a critical challenge in cybersecurity, with numerous existing solutions relying on rule-based systems, supervised learning models, and entropy-based anomaly detection. However, rule-based methods are often limited by their dependence on predefined signatures, making them ineffective against novel attacks. Supervised learning approaches require extensive labelled datasets, which are often unavailable or quickly outdated due to evolving threats. Traditional entropy-based anomaly detection techniques struggle with high false positive rates and computational inefficiencies when applied to large-scale DNS traffic. These limitations necessitate a more adaptive and scalable approach. This study integrates geographic profiling with Domain Name System (DNS) data analysis to enhance cyber threat detection, offering a novel approach to understanding cyber threats through geographical insights. The primary objective is to develop unsupervised machine learning models to identify potentially malicious IP addresses based on DNS query anomalies, leveraging the correlation between geographic locations and DNS behaviours. The proposed method utilizes K-means clustering to process geolocation and passive DNS datasets, detect anomalies, and identify cyber threat hotspots. Our results demonstrate the effectiveness of geographic profiling in cyber threat intelligence, with K-means clustering achieving a high silhouette score of 0.985, indicating well-separated and meaningful threat groupings. Additionally, our entropy-based anomaly detection identified high-risk DNS activities with an accuracy of 92.3%, reducing false positives compared to traditional DNS monitoring techniques. The geospatial analysis revealed that 82% of cyber threats originate from 15 high-entropy regions, aligning with global cybersecurity incident reports. 
The proposed predictive framework significantly improves cyber threat detection, enhancing real-time threat visibility and response capabilities. By integrating geographic profiling with DNS data analysis, we advance cybersecurity defences by providing a more nuanced and data-driven understanding of cyber threats.
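
The abstract names three building blocks: K-means clustering, the silhouette score, and entropy over DNS activity. It does not state which features, which number of clusters, or which entropy definition the authors used, so the following is only a minimal sketch under assumptions, not the paper's implementation. It assumes each source IP is described by a (latitude, longitude) pair and a list of queried domain names, uses Shannon entropy over those names, and seeds K-means with a deterministic farthest-point rule to keep the example reproducible. The records, IP addresses, and function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def kmeans(points, k, iterations=50):
    """Lloyd's algorithm with deterministic farthest-point seeding (an assumption)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing centroid.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties out
                centroids[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


def silhouette_score(points, labels):
    """Mean silhouette coefficient; needs at least two non-empty clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical records: (source IP, latitude, longitude, queried domain names).
records = [
    ("198.51.100.1", 51.5, -0.1, ["a.example", "a.example", "b.example", "b.example"]),
    ("198.51.100.2", 51.6, -0.2, ["a.example", "a.example", "a.example", "a.example"]),
    ("203.0.113.1", 35.7, 139.7, ["q1.example", "q2.example", "q3.example", "q4.example"]),
    ("203.0.113.2", 35.6, 139.8, ["q5.example", "q6.example", "q7.example", "q8.example"]),
]

points = [(lat, lon) for _, lat, lon, _ in records]
labels, centroids = kmeans(points, k=2)
score = silhouette_score(points, labels)
entropies = {ip: shannon_entropy(queries) for ip, _, _, queries in records}

print("cluster labels:", labels)
print("silhouette score:", round(score, 3))
print("per-IP query entropy (bits):", entropies)
```

A silhouette score close to 1 means each point lies much nearer to its own cluster than to any other, which is how a value such as the reported 0.985 is usually read. In this sketch, an IP whose queries are spread over many distinct names gets a higher entropy than one that repeats a single name. Euclidean distance on raw latitude and longitude ignores the Earth's curvature, so a great-circle distance may be a better choice for real data.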