G2MBCF: Enhanced Named Entity Recognition for sensitive entities identification

IF 2.7; Zone 3, Computer Science; JCR Q3, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Weibin Tian, Kaiming Gu, Shihui Xiao, Junbo Zhang, Wei Cui
Journal: Data & Knowledge Engineering, volume 159, Article 102444
Publication date: 2025-04-26
Publication type: Journal Article
DOI: 10.1016/j.datak.2025.102444
Source: https://www.sciencedirect.com/science/article/pii/S0169023X25000394
Citations: 0

Abstract

G2MBCF: Enhanced Named Entity Recognition for sensitive entities identification
As data volumes grow, data security work is becoming increasingly important. Sensitive entities identification (SEI), the core of important-data detection, has become a hot topic in natural language processing (NLP). Named Entity Recognition (NER) is the foundation of SEI; however, current studies treat SEI only as a special case of the NER problem and give little detailed consideration to the implicit links between entities and relations. In this paper, we propose a novel enhanced method called G2MBCF, based on the latent factor model (LFM). We use a knowledge graph to represent the primary NER result with semantic structure. We then use G2MBCF to characterize entities and relations through an E-R matrix in order to mine implicit connections. Experiments show that, compared to existing NER methods, our method improves the Recall and Precision of SEI. We also study the influence of the parameters in the experiments.
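The abstract does not spell out how G2MBCF builds or factorizes its E-R matrix, so the following is only a generic latent factor model sketch, not the authors' method. It assumes a small hypothetical entity-by-relation matrix `er` in which 1 means the relation was observed for the entity, plus a `mask` marking which cells are known. The function name `factorize`, the toy data, and all hyperparameters are illustrative assumptions.

```python
import numpy as np

def factorize(er, mask, k=2, lr=0.05, reg=0.01, epochs=2000, seed=0):
    """Fit er ~ P @ Q.T on the observed cells with plain gradient descent.

    Generic latent factor model; NOT the G2MBCF algorithm itself.
    """
    rng = np.random.default_rng(seed)
    n_ent, n_rel = er.shape
    P = 0.1 * rng.standard_normal((n_ent, k))  # entity factors
    Q = 0.1 * rng.standard_normal((n_rel, k))  # relation factors
    for _ in range(epochs):
        err = mask * (er - P @ Q.T)            # residual on observed cells only
        grad_P = err @ Q - reg * P             # descent direction with L2 shrinkage
        grad_Q = err.T @ P - reg * Q
        P += lr * grad_P
        Q += lr * grad_Q
    return P, Q

# Hypothetical toy E-R matrix: 4 entities (rows) x 3 relations (columns).
er = np.array([
    [1.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],   # cell (1, 1) is unknown; the 0 is only a placeholder
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0],
])
mask = np.ones_like(er)
mask[1, 1] = 0.0       # hide the unknown cell from training

P, Q = factorize(er, mask)
scores = P @ Q.T       # predicted strength of every entity-relation pair
print(np.round(scores, 2))
```

In this toy setup, entity 1 matches entity 0 on every observed relation, so the factorization assigns the hidden cell (1, 1) a high score. That is the sense in which a latent factor model can surface an implicit entity-relation link that the primary NER result did not record.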
Source journal
Data & Knowledge Engineering (Engineering & Technology: Computer Science, Artificial Intelligence)
CiteScore: 5.00
Self-citation rate: 0.00%
Articles published: 66
Review time: 6 months
About the journal: Data & Knowledge Engineering (DKE) stimulates the exchange of ideas and interaction between these two related fields of interest. DKE reaches a world-wide audience of researchers, designers, managers and users. The major aim of the journal is to identify, investigate and analyze the underlying principles in the design and effective use of these systems.