G2MBCF: Enhanced Named Entity Recognition for sensitive entities identification
Authors: Weibin Tian, Kaiming Gu, Shihui Xiao, Junbo Zhang, Wei Cui
Journal: Data & Knowledge Engineering, volume 159, Article 102444 (published 2025-04-26)
DOI: 10.1016/j.datak.2025.102444
URL: https://www.sciencedirect.com/science/article/pii/S0169023X25000394
As data volumes grow, data security work is becoming increasingly important. Sensitive entities identification (SEI), the core of important-data detection, has become a hot topic in natural language processing (NLP). Named Entity Recognition (NER) is the foundation of SEI; however, current studies treat SEI only as a special case of the NER problem, and this treatment lacks detailed consideration of the implicit links between entities and relations. In this paper, we propose a novel enhanced method called G2MBCF, based on the latent factor model (LFM). We use a knowledge graph to represent the primary NER result with semantic structure. We then use G2MBCF to describe entities and relations through an E-R matrix in order to mine implicit connections. Experiments show that, compared with existing NER methods, our method improves the Recall and Precision of SEI. We also study the influence of parameters in the experiments.
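The abstract does not give the details of G2MBCF, so the following is only a rough sketch of the general idea it names: a latent factor model applied to an entity-by-relation matrix derived from knowledge-graph triples. Everything concrete here is an assumption made for illustration, not the authors' method: the binary matrix layout (an entity is marked against every relation it participates in), the plain SGD matrix factorization with L2 regularization, the function names `build_er_matrix`, `factorize`, and `score`, and the toy triples.

```python
import random


def build_er_matrix(triples):
    """Build a binary entity-by-relation matrix from (head, relation, tail) triples.

    Assumption: a cell is 1.0 if the entity appears as head or tail of that relation.
    """
    entities = sorted({h for h, _, _ in triples} | {t for _, _, t in triples})
    relations = sorted({r for _, r, _ in triples})
    e_idx = {e: i for i, e in enumerate(entities)}
    r_idx = {r: j for j, r in enumerate(relations)}
    matrix = [[0.0] * len(relations) for _ in entities]
    for h, r, t in triples:
        matrix[e_idx[h]][r_idx[r]] = 1.0
        matrix[e_idx[t]][r_idx[r]] = 1.0
    return entities, relations, matrix


def factorize(matrix, k=2, steps=500, lr=0.05, reg=0.02, seed=0):
    """Approximate matrix ~ P * Q^T with k latent factors, trained by SGD.

    Every cell (including zeros) is treated as an observation; this is a
    simplification chosen for the sketch.
    """
    rng = random.Random(seed)
    n, m = len(matrix), len(matrix[0])
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(m)]
    for _ in range(steps):
        for i in range(n):
            for j in range(m):
                pred = sum(P[i][f] * Q[j][f] for f in range(k))
                err = matrix[i][j] - pred
                for f in range(k):
                    p, q = P[i][f], Q[j][f]
                    # Gradient step on squared error with L2 regularization.
                    P[i][f] += lr * (err * q - reg * p)
                    Q[j][f] += lr * (err * p - reg * q)
    return P, Q


def score(P, Q, i, j):
    """Predicted affinity between entity i and relation j."""
    return sum(p * q for p, q in zip(P[i], Q[j]))


# Hypothetical toy triples, standing in for a primary NER result stored as a graph.
triples = [
    ("Alice", "works_at", "Acme"),
    ("Alice", "has_phone", "555-0100"),
    ("Bob", "works_at", "Acme"),
    ("Bob", "has_account", "ACC-1"),
]
entities, relations, er = build_er_matrix(triples)
P, Q = factorize(er, k=2)
for i, entity in enumerate(entities):
    for j, relation in enumerate(relations):
        if er[i][j] == 0.0:
            # Unobserved pairs with a relatively high score are candidate implicit links.
            print(f"{entity} / {relation}: {score(P, Q, i, j):.2f}")
```

Because the factorization has fewer latent factors than the matrix has independent rows, it cannot reproduce every cell exactly, and the scores it assigns to unobserved entity-relation pairs can be read as a crude signal of implicit connections. How such scores would feed back into a sensitive-entity decision is not stated in the abstract and is left open here.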
About the journal:
Data & Knowledge Engineering (DKE) stimulates the exchange of ideas and interaction between the two related fields of data engineering and knowledge engineering. DKE reaches a worldwide audience of researchers, designers, managers and users. The major aim of the journal is to identify, investigate and analyze the underlying principles in the design and effective use of these systems.