Fusion Deep Learning and Machine Learning for Multi-source Heterogeneous Military Entity Recognition

Hui Li, Lin Yu, Mingqi Lyu, Yuwen Qian
Published in: 2021 IEEE Conference on Telecommunications, Optics and Computer Science (TOCS), 2021-12-10
DOI: 10.1109/TOCS53301.2021.9688813 (https://doi.org/10.1109/TOCS53301.2021.9688813)

Abstract

Currently, little research addresses military entity recognition across corpora. Based on the heterogeneous characteristics of military entities in different military data sets, such as military documents, reconnaissance intelligence, simulation training mission concepts, military books, military blogs, military reviews and military websites, we construct three types of military entity recognition corpora, namely abbreviated, scientific or English name, novel and random, to improve the construction of sub-scenario datasets. To address the fuzzy boundaries of heterogeneous military entities, we improve the annotation mechanism for entities with fuzzy boundaries, building on related research. We apply a BERT-BiLSTM-CRF model, which fuses deep learning and machine learning, to recognize military entities, and we design multiple types of experiments to verify the model's practical effect. Experimental results show that, in the multi-data-source scenario, recognition performance keeps improving as the corpus grows, with the F-score increasing from 73.56% to 84.53%.
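The abstract reports an F-score but does not say how it is computed. As a minimal sketch, assuming BIO-style tags and entity-level exact matching (a common convention for sequence-labeling models such as BERT-BiLSTM-CRF, not confirmed by the source), the metric could be computed as follows. The tag names `WPN` and `ORG` and the helper names `extract_spans` and `f_score` are hypothetical and chosen only for illustration.

```python
from typing import List, Set, Tuple

# (start index, end index exclusive, entity type); layout is an assumption
Span = Tuple[int, int, str]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO tag sequence."""
    spans: Set[Span] = set()
    start, etype = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        # Close the open span unless this tag continues it.
        if start is not None and tag != f"I-{etype}":
            spans.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
        # A stray "I-" tag with no open span is ignored in this sketch.
    return spans


def f_score(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Entity-level F1: a prediction counts only if boundaries and type match."""
    tp = n_gold = n_pred = 0
    for g, p in zip(gold, pred):
        gs, ps = extract_spans(g), extract_spans(p)
        tp += len(gs & ps)
        n_gold += len(gs)
        n_pred += len(ps)
    if tp == 0:
        return 0.0
    precision = tp / n_pred
    recall = tp / n_gold
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: the model finds one of the two gold entities.
gold_tags = [["B-WPN", "I-WPN", "O", "B-ORG"]]
pred_tags = [["B-WPN", "I-WPN", "O", "O"]]
score = f_score(gold_tags, pred_tags)
```

Under exact matching, a prediction that gets an entity's boundary wrong earns no credit, which is one reason the annotation of fuzzy-boundary entities matters for the reported score.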