Automatic Tagging of Cyber Threat Intelligence Unstructured Data using Semantics Extraction

Tianyi Wang, Kam-pui Chow
Published in: 2019 IEEE International Conference on Intelligence and Security Informatics (ISI), 2019-07-01
DOI: 10.1109/ISI.2019.8823252 (https://doi.org/10.1109/ISI.2019.8823252)
Citations: 9

Abstract

Threat intelligence, meaning information about potential or ongoing attacks against an organization, is an important component of cyber security. As new threats continually emerge, cyber security professionals monitor the latest threat intelligence in order to keep lowering the security risks facing their organizations. Cyber threat intelligence is usually conveyed as structured data, such as CVE entries, and as unstructured data, such as articles and reports. Structured data follow fixed patterns that are easy to analyze, whereas unstructured data rarely offer fixed patterns to analyze. Many methods and algorithms exist for extracting information from structured data, but no existing work is complete or well suited to semantics extraction from unstructured cyber threat intelligence data. In this paper, we introduce an approach to automatic tagging that applies the JAPE (Java Annotation Patterns Engine) feature of the GATE framework to extract semantics from unstructured cyber threat intelligence data such as articles and reports. We extract token entities from each cyber threat intelligence article or report and evaluate their usefulness. A threat intelligence ontology can then be constructed from the useful entities extracted from related resources, making it easier for professionals to find the latest useful threat intelligence they need.
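The tagging idea in the abstract can be illustrated with a minimal Python sketch. It does not use GATE or JAPE and is not the authors' implementation; it only mimics the general approach of matching patterns and word lists against text and attaching a tag to each match. The tag names, the regular expressions, and the small `GAZETTEER` word list are all assumptions made for this example.

```python
import re
from typing import List, NamedTuple


class Annotation(NamedTuple):
    start: int  # character offset where the match begins
    end: int    # character offset just past the match
    tag: str    # label attached to the matched span
    text: str   # the matched text itself


# Hypothetical word list mapping lowercase terms to tags (illustrative only).
GAZETTEER = {
    "wannacry": "MalwareName",
    "ransomware": "MalwareType",
    "phishing": "AttackTechnique",
}

# Hypothetical pattern rules: a tag plus a regular expression.
PATTERNS = [
    ("CVE", re.compile(r"\bCVE-\d{4}-\d{4,}\b")),
    ("IPv4", re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")),
    ("MD5", re.compile(r"\b[a-fA-F0-9]{32}\b")),
]

WORD = re.compile(r"[A-Za-z]+")


def tag_text(text: str) -> List[Annotation]:
    """Return tagged spans found in text, ordered by position."""
    annotations = []
    # Pattern rules: every regex match becomes an annotation.
    for tag, pattern in PATTERNS:
        for match in pattern.finditer(text):
            annotations.append(
                Annotation(match.start(), match.end(), tag, match.group())
            )
    # Gazetteer lookup: tag each word that appears in the word list.
    for match in WORD.finditer(text):
        tag = GAZETTEER.get(match.group().lower())
        if tag is not None:
            annotations.append(
                Annotation(match.start(), match.end(), tag, match.group())
            )
    return sorted(annotations)


if __name__ == "__main__":
    sample = "WannaCry ransomware exploited CVE-2017-0144 and contacted 192.0.2.10."
    for annotation in tag_text(sample):
        print(annotation.tag, annotation.text)
```

In this sketch every rule is independent, so adding a new tag only requires a new regular expression or word-list entry. Judging which extracted entities are actually useful, as the abstract describes, is a separate step that the sketch does not attempt.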