G2MBCF: Enhanced Named Entity Recognition for sensitive entities identification

IF 2.7 | Zone 3, Computer Science | Q3, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Data & Knowledge Engineering | Pub Date: 2025-04-26 | DOI: 10.1016/j.datak.2025.102444

Weibin Tian, Kaiming Gu, Shihui Xiao, Junbo Zhang, Wei Cui

Volume 159, Article 102444 (Journal Article)

https://www.sciencedirect.com/science/article/pii/S0169023X25000394

Citations: 0

Abstract

With the continuing growth of data, work on data security is becoming increasingly important. As the core of important-data detection, the sensitive entities identification (SEI) problem has become a hot topic in natural language processing (NLP). Named Entity Recognition (NER) is the foundation of SEI; however, current studies treat SEI only as a special case of the NER problem, an approach that lacks detailed consideration of the implicit links between entities and relations. In this paper, we propose a novel enhanced method called G2MBCF, based on the latent factor model (LFM). We use a knowledge graph to represent the primary NER result with semantic structure. We then use G2MBCF to inscribe entities and relations in an E-R matrix in order to mine implicit connections. Experiments show that, compared to existing NER methods, our method improves the Recall and Precision of SEI. We also study the influence of parameters in the experiments.
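The abstract does not say how the E-R matrix is built or factorized, so the following is only a generic latent-factor sketch of the idea of mining implicit entity-relation links, not the G2MBCF method itself. The toy entities, relation names, function name `factorize`, and all hyperparameters are assumptions made for illustration. Cells hold 1 (link observed in the knowledge graph), 0 (link observed absent), or NaN (unknown).

```python
import numpy as np

def factorize(er, k=1, steps=2000, lr=0.05, reg=0.01, seed=0):
    """Fit er ~= P @ Q.T on the observed (non-NaN) cells by gradient descent.

    Hypothetical helper: a plain regularized latent factor model,
    not the procedure described in the paper.
    """
    rng = np.random.default_rng(seed)
    n_entities, n_relations = er.shape
    P = 0.1 * rng.standard_normal((n_entities, k))   # entity factors
    Q = 0.1 * rng.standard_normal((n_relations, k))  # relation factors
    observed = ~np.isnan(er)
    target = np.where(observed, er, 0.0)
    for _ in range(steps):
        # Residual on observed cells only; unknown cells contribute nothing.
        err = observed * (target - P @ Q.T)
        grad_P = err @ Q - reg * P
        grad_Q = err.T @ P - reg * Q
        P += lr * grad_P
        Q += lr * grad_Q
    return P, Q

# Toy E-R matrix (assumed data): rows are entities, columns are relations.
entities = ["Alice", "Bob", "Carol", "Dave"]
relations = ["works_at", "has_phone", "has_account"]
er = np.array([
    [1.0, 1.0, 1.0],
    [1.0, 1.0, np.nan],   # is Bob linked to an account? unknown
    [0.0, 0.0, 0.0],
    [0.0, np.nan, 0.0],   # is Dave linked to a phone? unknown
])

# k=1 because this toy matrix contains a single link pattern.
P, Q = factorize(er)
scores = P @ Q.T          # predicted link strength for every cell
print(np.round(scores, 2))
```

In this toy setup the unknown Bob/has_account cell is expected to score high, because Bob's observed links mirror Alice's, while the unknown Dave/has_phone cell stays near zero. A real E-R matrix would need more than one latent factor and hyperparameters tuned on held-out data.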

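Source journal

Data & Knowledge Engineering (Engineering & Technology, Computer Science: Artificial Intelligence)

CiteScore: 5.00

Self-citation rate: 0.00%

Number of articles published: (not listed)

Review time: 6 months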

About the journal: Data & Knowledge Engineering (DKE) stimulates the exchange of ideas and interaction between these two related fields of interest. DKE reaches a world-wide audience of researchers, designers, managers and users. The major aim of the journal is to identify, investigate and analyze the underlying principles in the design and effective use of these systems.