TRAPPER: Learning with Weak Supervision for Threat Intelligence Entity Recognition
Yaru Yang, Zhi Liu, Jiaxing Song
Proceedings of the 4th International Conference on Advanced Information Science and System, 2022-11-25
https://doi.org/10.1145/3573834.3574526
The emergence of threat intelligence provides a stronger basis for tracing the source of network attacks, but it also requires a significant amount of manual analysis. Although data-driven automatic information extraction can effectively reduce this manual effort, it is limited by the lack of labeled data in the threat intelligence domain. To overcome this limitation, we propose TRAPPER, a threat entity recognition framework that infers real threat entities from unlabeled threat sentences, avoiding costly manual labeling. TRAPPER relies on label functions and three components (a label aggregator, a label predictor, and a label expander), which guide the model with weak supervision and use transferred knowledge as an aid. The label functions let us inject expert knowledge into the label aggregator, which generates the inputs needed by the label predictor; this enables the label predictor to learn to recognize threat entities. The label expander combines the multi-source noisy label information with transferred entity recognition semantic knowledge to further expand the set of recognized entities. Throughout the process, the components reinforce one another by learning from each other. Comparative experiments on three threat intelligence-related datasets show that our method effectively identifies threat entities and achieves a maximum F1 score improvement of 6.3% over the best baseline.
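To make the idea of label functions feeding a label aggregator concrete, the following is a minimal sketch and not TRAPPER's actual implementation. The abstract does not specify the label functions, the entity types, or the aggregation rule, so everything here is an assumption for illustration: the function names (`lf_ip`, `lf_cve`, `lf_malware`, `aggregate`), the tag set (`IP`, `VULN`, `MALWARE`, `O`), the tiny gazetteer, and the use of a simple majority vote in place of the paper's aggregator.

```python
import re
from collections import Counter

ABSTAIN = "ABSTAIN"  # a label function returns this when it has no opinion

# Hypothetical gazetteer of malware names (illustrative only).
MALWARE_NAMES = {"emotet", "wannacry"}

# Simple patterns standing in for expert knowledge (assumed, not from the paper).
IPV4 = re.compile(r"^(?:\d{1,3}\.){3}\d{1,3}$")
CVE = re.compile(r"^CVE-\d{4}-\d{4,}$", re.IGNORECASE)


def lf_ip(token):
    """Label tokens that look like IPv4 addresses."""
    return "IP" if IPV4.match(token) else ABSTAIN


def lf_cve(token):
    """Label tokens that look like CVE identifiers."""
    return "VULN" if CVE.match(token) else ABSTAIN


def lf_malware(token):
    """Label tokens found in the malware gazetteer."""
    return "MALWARE" if token.lower() in MALWARE_NAMES else ABSTAIN


LABEL_FUNCTIONS = [lf_ip, lf_cve, lf_malware]


def aggregate(tokens, label_functions=LABEL_FUNCTIONS):
    """Majority vote over non-abstaining label functions, per token.

    This is a stand-in for a learned label aggregator: tokens with no
    votes are tagged "O" (outside any entity).
    """
    labels = []
    for token in tokens:
        votes = [lf(token) for lf in label_functions]
        votes = [v for v in votes if v != ABSTAIN]
        if votes:
            labels.append(Counter(votes).most_common(1)[0][0])
        else:
            labels.append("O")
    return labels


tokens = "Emotet contacted 10.0.0.5 after exploiting CVE-2017-0144".split()
weak_labels = aggregate(tokens)
```

For the example sentence, `weak_labels` is `["MALWARE", "O", "IP", "O", "O", "VULN"]`. In a weakly supervised pipeline, noisy labels of this kind would serve as training targets for a sequence model playing the role of the label predictor; a learned aggregator would typically also weight label functions by estimated accuracy rather than counting votes equally.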