Lan Huang, Jiayuan Zhang, Bo Wang, Zixu Li, Shu Wang, Rui Zhang
Pattern Recognition, Volume 166, Article 111703. Published 2025-04-21. DOI: 10.1016/j.patcog.2025.111703. URL: https://www.sciencedirect.com/science/article/pii/S0031320325003632
A framework for global role-based author name disambiguation
The academic community has long been confronted with the problem of Author Name Disambiguation (AND), which arises when different authors share the same name. Most existing methods formalize AND as a paper-clustering task, based on the assumption that the more similar two papers are, the more likely they are to be the work of the same researcher. This paper introduces GRAND, a framework for global role-based author name disambiguation. It redefines the AND problem by distinguishing between a real-world researcher and the author roles he or she plays, formalizing it as a role-player matching problem. Furthermore, it proposes an embedding and clustering strategy based on meta-paths, combined with a global coauthor sampling algorithm that addresses ambiguity in coauthor pairs. Finally, a set of rule-based metrics is employed to match real-world researchers with their author roles. The innovation of GRAND lies in its combination of a global meta-path embedding method with rule-based author mapping. It effectively handles ambiguous coauthor relationships, combines local and global information, and improves disambiguation by distinguishing between researchers and the author roles they play. The experimental results show that GRAND outperforms several state-of-the-art approaches, with the F1-score improving by 0.49% to 5.45% across the three datasets.
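To make the conventional formulation concrete, the following is a minimal sketch of AND as paper clustering, the baseline assumption the abstract describes, and not GRAND itself. It assumes that each paper is represented only by its set of coauthor names, that similarity is Jaccard overlap, and that papers above a threshold are merged with union-find. The function names, the threshold value, and the toy data are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_papers(papers, threshold=0.3):
    """Group papers whose coauthor sets overlap by at least `threshold`.

    `papers` maps a paper id to a set of coauthor names (an assumed,
    simplified feature). Returns a list of sets of paper ids, one set
    per presumed researcher.
    """
    parent = {pid: pid for pid in papers}

    def find(x):
        # Follow parent links to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge every pair of papers that is similar enough.
    for p, q in combinations(papers, 2):
        if jaccard(papers[p], papers[q]) >= threshold:
            parent[find(p)] = find(q)

    clusters = {}
    for pid in papers:
        clusters.setdefault(find(pid), set()).add(pid)
    return list(clusters.values())


# Hypothetical toy data: four papers signed with the same ambiguous name,
# each described by the other names on its byline.
papers = {
    "p1": {"A. Chen", "B. Kumar"},
    "p2": {"A. Chen", "B. Kumar", "C. Ortiz"},
    "p3": {"D. Novak"},
    "p4": {"D. Novak", "E. Sato"},
}
clusters = cluster_papers(papers)
```

On this toy input the papers split into two groups, which the baseline would treat as two distinct researchers. The sketch also shows where the assumption is fragile: coauthor names can themselves be ambiguous, which is the situation the abstract says GRAND's global coauthor sampling is meant to address.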
About the journal:
The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.