Fusion Deep Learning and Machine Learning for Multi-source Heterogeneous Military Entity Recognition
Authors: Hui Li, Lin Yu, Mingqi Lyu, Yuwen Qian
Published in: 2021 IEEE Conference on Telecommunications, Optics and Computer Science (TOCS)
Publication date: 2021-12-10
DOI: 10.1109/TOCS53301.2021.9688813 (https://doi.org/10.1109/TOCS53301.2021.9688813)
Citations: 0
Abstract
Little research currently addresses military entity recognition across corpora. Because military entities have heterogeneous characteristics across military data sets such as military documents, reconnaissance intelligence, simulation training mission conception, military books, military blogs, military reviews, and military websites, we construct three types of military entity recognition corpus (abbreviated, scientific or English name, novel and random) to improve the construction of sub-scenario datasets. For heterogeneous military entities with fuzzy boundaries, we improve the entity annotation mechanism, building on related research. We apply a BERT-BiLSTM-CRF model, which fuses deep learning and machine learning, to recognize military entities, and we design multiple types of experiments to verify the model's practical effectiveness. Experimental results show that, in the multi-data-source scenario, recognition performance keeps improving as the corpus grows, with the F-score increasing from 73.56% to 84.53%.
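For readers unfamiliar with how an F-score for entity recognition is typically computed, the following is a minimal sketch, not the authors' evaluation code. It assumes BIO tagging and exact-match scoring (an entity counts as correct only if both its boundaries and its type match), neither of which the abstract specifies. The tag names `WEAPON` and `UNIT` and the function names are hypothetical.

```python
from typing import List, Optional, Set, Tuple

# (start index, end index exclusive, entity type)
Span = Tuple[int, int, str]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO-tagged sentence (assumed scheme)."""
    spans: Set[Span] = set()
    start: Optional[int] = None
    label: Optional[str] = None
    # A trailing "O" sentinel flushes an entity that runs to the sentence end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if not continues:
            if label is not None:
                spans.add((start, i, label))
                start, label = None, None
            if tag.startswith("B-"):
                start, label = i, tag[2:]
            # An "I-" tag with no matching open entity is ignored (strict reading).
    return spans


def entity_f_score(
    gold: List[List[str]], pred: List[List[str]]
) -> Tuple[float, float, float]:
    """Return (precision, recall, F-score) with exact span-and-type matching."""
    true_pos = n_gold = n_pred = 0
    for gold_tags, pred_tags in zip(gold, pred):
        gold_spans = extract_spans(gold_tags)
        pred_spans = extract_spans(pred_tags)
        true_pos += len(gold_spans & pred_spans)
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
    precision = true_pos / n_pred if n_pred else 0.0
    recall = true_pos / n_gold if n_gold else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


if __name__ == "__main__":
    # Hypothetical tags: the prediction finds the WEAPON entity but misses UNIT.
    gold = [["B-WEAPON", "I-WEAPON", "O", "B-UNIT"]]
    pred = [["B-WEAPON", "I-WEAPON", "O", "O"]]
    print(entity_f_score(gold, pred))
```

Exact matching penalizes boundary errors heavily, which is one reason annotation rules for fuzzy-boundary entities can affect the measured score.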