基于变压器模型的网络威胁情报命名实体识别

Pavlos Evangelatos, Christos Iliou, T. Mavropoulos, Konstantinos Apostolou, T. Tsikrika, S. Vrochidis, Y. Kompatsiaris
{"title":"基于变压器模型的网络威胁情报命名实体识别","authors":"Pavlos Evangelatos, Christos Iliou, T. Mavropoulos, Konstantinos Apostolou, T. Tsikrika, S. Vrochidis, Y. Kompatsiaris","doi":"10.1109/CSR51186.2021.9527981","DOIUrl":null,"url":null,"abstract":"The continuous increase in sophistication of threat actors over the years has made the use of actionable threat intelligence a critical part of the defence against them. Such Cyber Threat Intelligence is published daily on several online sources, including vulnerability databases, CERT feeds, and social media, as well as on forums and web pages from the Surface and the Dark Web. Named Entity Recognition (NER) techniques can be used to extract the aforementioned information in an actionable form from such sources. In this paper we investigate how the latest advances in the NER domain, and in particular transformer-based models, can facilitate this process. To this end, the dataset for NER in Threat Intelligence (DNRTI) containing more than 300 pieces of threat intelligence reports from open source threat intelligence websites is used. Our experimental results demonstrate that transformer-based techniques are very effective in extracting cybersecurity-related named entities, by considerably outperforming the previous state- of-the-art approaches tested with DNRTI.","PeriodicalId":253300,"journal":{"name":"2021 IEEE International Conference on Cyber Security and Resilience (CSR)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":"{\"title\":\"Named Entity Recognition in Cyber Threat Intelligence Using Transformer-based Models\",\"authors\":\"Pavlos Evangelatos, Christos Iliou, T. Mavropoulos, Konstantinos Apostolou, T. Tsikrika, S. Vrochidis, Y. Kompatsiaris\",\"doi\":\"10.1109/CSR51186.2021.9527981\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The continuous increase in sophistication of threat actors over the years has made the use of actionable threat intelligence a critical part of the defence against them. Such Cyber Threat Intelligence is published daily on several online sources, including vulnerability databases, CERT feeds, and social media, as well as on forums and web pages from the Surface and the Dark Web. Named Entity Recognition (NER) techniques can be used to extract the aforementioned information in an actionable form from such sources. In this paper we investigate how the latest advances in the NER domain, and in particular transformer-based models, can facilitate this process. To this end, the dataset for NER in Threat Intelligence (DNRTI) containing more than 300 pieces of threat intelligence reports from open source threat intelligence websites is used. Our experimental results demonstrate that transformer-based techniques are very effective in extracting cybersecurity-related named entities, by considerably outperforming the previous state- of-the-art approaches tested with DNRTI.\",\"PeriodicalId\":253300,\"journal\":{\"name\":\"2021 IEEE International Conference on Cyber Security and Resilience (CSR)\",\"volume\":\"12 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-07-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"6\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 IEEE International Conference on Cyber Security and Resilience (CSR)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CSR51186.2021.9527981\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE International Conference on Cyber Security and Resilience (CSR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CSR51186.2021.9527981","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 6

摘要

多年来,威胁行为者的复杂性不断增加,使得使用可操作的威胁情报成为防御他们的关键部分。此类网络威胁情报每天都会在多个在线来源上发布,包括漏洞数据库、CERT feed和社交媒体,以及来自Surface和Dark web的论坛和网页。命名实体识别(NER)技术可用于从这些来源中以可操作的形式提取上述信息。在本文中,我们研究了NER领域的最新进展,特别是基于变压器的模型,如何促进这一过程。为此,使用了来自开源威胁情报网站的300多份威胁情报报告的威胁情报数据库(DNRTI)。我们的实验结果表明,基于变压器的技术在提取与网络安全相关的命名实体方面非常有效,其性能大大优于先前使用DNRTI测试的最先进方法。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Named Entity Recognition in Cyber Threat Intelligence Using Transformer-based Models
The continuous increase in sophistication of threat actors over the years has made the use of actionable threat intelligence a critical part of the defence against them. Such Cyber Threat Intelligence is published daily on several online sources, including vulnerability databases, CERT feeds, and social media, as well as on forums and web pages from the Surface and the Dark Web. Named Entity Recognition (NER) techniques can be used to extract the aforementioned information in an actionable form from such sources. In this paper we investigate how the latest advances in the NER domain, and in particular transformer-based models, can facilitate this process. To this end, the dataset for NER in Threat Intelligence (DNRTI) containing more than 300 pieces of threat intelligence reports from open source threat intelligence websites is used. Our experimental results demonstrate that transformer-based techniques are very effective in extracting cybersecurity-related named entities, by considerably outperforming the previous state- of-the-art approaches tested with DNRTI.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信