An unsupervised machine learning approach for cyber threat detection using geographic profiling and Domain Name System data

Seyed-Ali Sadegh-Zadeh, Mostafa Tajdini
Decision Analytics Journal, Volume 15, Article 100576. Journal article, published 2025-04-16.
DOI: 10.1016/j.dajour.2025.100576
https://www.sciencedirect.com/science/article/pii/S2772662225000323
Citations: 0

Abstract

Cyber threat detection is a critical challenge in cybersecurity, with numerous existing solutions relying on rule-based systems, supervised learning models, and entropy-based anomaly detection. However, rule-based methods are often limited by their dependence on predefined signatures, making them ineffective against novel attacks. Supervised learning approaches require extensive labelled datasets, which are often unavailable or quickly outdated due to evolving threats. Traditional entropy-based anomaly detection techniques struggle with high false positive rates and computational inefficiencies when applied to large-scale DNS traffic. These limitations necessitate a more adaptive and scalable approach. This study integrates geographic profiling with Domain Name System (DNS) data analysis to enhance cyber threat detection, offering a novel approach to understanding cyber threats through geographical insights. The primary objective is to develop unsupervised machine learning models to identify potentially malicious IP addresses based on DNS query anomalies, leveraging the correlation between geographic locations and DNS behaviours. The proposed method utilizes K-means clustering to process geolocation and passive DNS datasets, detect anomalies, and identify cyber threat hotspots. Our results demonstrate the effectiveness of geographic profiling in cyber threat intelligence, with K-means clustering achieving a high silhouette score of 0.985, indicating well-separated and meaningful threat groupings. Additionally, our entropy-based anomaly detection identified high-risk DNS activities with an accuracy of 92.3%, reducing false positives compared to traditional DNS monitoring techniques. The geospatial analysis revealed that 82% of cyber threats originate from 15 high-entropy regions, aligning with global cybersecurity incident reports. The proposed predictive framework significantly improves cyber threat detection, enhancing real-time threat visibility and response capabilities. By integrating geographic profiling with DNS data analysis, we advance cybersecurity defences by providing a more nuanced and data-driven understanding of cyber threats.
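
The abstract names three building blocks: K-means clustering over geolocation data, the silhouette score as a cluster-quality measure, and an entropy-based check on DNS activity. The sketch below illustrates each on a toy dataset using only the Python standard library. It is not the authors' pipeline: the records are made up, the entropy is computed over the characters of the leftmost DNS label (a common heuristic, assumed here), the 3.5-bit threshold is hypothetical, and plain Euclidean distance on latitude and longitude stands in for a proper geodesic distance.

```python
import math
import random
from collections import Counter


def shannon_entropy(text):
    """Shannon entropy of a string, in bits per character."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return labels


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (ranges from -1 to 1)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != lab)
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy records: (latitude, longitude, queried domain). All values are made up.
records = [
    (52.52, 13.40, "example.com"),
    (52.50, 13.45, "mail.example.org"),
    (52.55, 13.38, "x7f3kq9zb2vd8w1m.example.net"),
    (35.68, 139.69, "docs.example.com"),
    (35.70, 139.70, "q8z1k4v7p2x9m3b6.example.net"),
    (35.66, 139.72, "example.org"),
]
ENTROPY_THRESHOLD = 3.5  # hypothetical cut-off in bits, not taken from the paper

points = [(lat, lon) for lat, lon, _ in records]
labels = kmeans(points, k=2)
score = silhouette_score(points, labels)
flagged = [name for _, _, name in records
           if shannon_entropy(name.split(".")[0]) > ENTROPY_THRESHOLD]

print("cluster labels:", labels)
print("silhouette score:", round(score, 3))
print("high-entropy queries:", flagged)
```

A silhouette score near 1 means each point sits much closer to its own cluster than to the nearest other cluster, which is the sense in which the reported 0.985 indicates well-separated groupings. For real data, a library implementation such as scikit-learn's `KMeans` and `silhouette_score` would be the more practical choice.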