EnhanceCTI: Enhanced semantic filtering and feature extraction framework for industry-specific cyber threat intelligence
Sheng-Shan Chen, Tun-Wen Pai, Chin-Yu Sun
Computers & Security, Volume 158, Article 104649. Published 2025-09-01. Impact factor 5.4 (JCR Q1, Computer Science, Information Systems).
DOI: 10.1016/j.cose.2025.104649
https://www.sciencedirect.com/science/article/pii/S0167404825003384
The rapid digitization of industry has created an urgent need for robust cyber threat intelligence (CTI) systems. Organizations are increasingly building threat intelligence platforms (TIPs) to gather open-source intelligence (OSINT) and turn it into actionable defenses against information security breaches. However, the volume and complexity of OSINT data, which often includes false or misleading information, pose significant challenges for effective CTI analysis. This study introduces EnhanceCTI, a system designed to improve the quality and industry-specific applicability of threat intelligence. EnhanceCTI uses an enhanced semantic filtering method based on DistilBERT, a distilled variant of bidirectional encoder representations from transformers (BERT), to filter intelligence data and determine its alignment with industry-specific data extracted from TIPs. The filtering covers eight industry categories: healthcare, finance, government, technology, education, telecommunications, critical infrastructure, and a miscellaneous “others” category. EnhanceCTI also integrates high-credibility CTI features with SentenceBERT to build a merging judgment model, which determines whether a given piece of intelligence should be merged with existing data or stored independently, thereby preserving relevance and minimizing redundancy. Finally, a dedicated platform was developed that gives cybersecurity analysts tools to rapidly assess both intelligence quality and the accuracy of the industry-specific classification models. In experiments, EnhanceCTI achieved an F1-score of 0.99 for intelligence identification, and SentenceBERT reached a Pearson correlation of 0.89 on cosine similarity. A random forest algorithm trained on 750 manually annotated samples achieved an F1-score of 0.97 for the merging judgment model. These findings highlight EnhanceCTI’s ability to accurately identify threats, offering an industry-tailored solution for institutions facing growing cybersecurity challenges.
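The correlation reported for SentenceBERT is presumably the usual sentence-similarity metric: the Pearson correlation between the cosine similarity of two sentence embeddings and a human-assigned similarity score. The abstract does not spell out the evaluation procedure, so the following is only a minimal sketch of that metric under this assumption. The three-dimensional vectors and the `gold` scores are made-up illustrative values, not data from the paper, and the function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy vectors standing in for sentence embeddings (assumed values).
pairs = [
    ([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]),  # same direction
    ([1.0, 0.0, 0.0], [1.0, 1.0, 0.0]),  # partially aligned
    ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]),  # orthogonal
]
# Hypothetical annotator similarity scores for the same three pairs.
gold = [1.0, 0.6, 0.0]

predicted = [cosine_similarity(u, v) for u, v in pairs]
score = pearson(predicted, gold)
print(f"cosine similarities: {predicted}")
print(f"Pearson correlation with gold scores: {score:.3f}")
```

In a real evaluation the vectors would come from the embedding model and the gold scores from an annotated test set. A merging decision could then use the cosine similarity as one input feature to a classifier, which is consistent with, but not confirmed by, the random forest step described above.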
About the journal:
Computers & Security is the most respected technical journal in the IT security field. With its high-profile editorial board and informative regular features and columns, the journal is essential reading for IT security professionals around the world.
Computers & Security provides you with a unique blend of leading-edge research and sound practical management advice. It is aimed at the professional involved with computer security, audit, control and data integrity in all sectors: industry, commerce and academia. Recognized worldwide as the primary source of reference for applied research and technical expertise, it is your first step to fully secure systems.