Qiang Zhou, Wei Chen, Weiqing Wang, Jiajie Xu, Lei Zhao
{"title":"多特征驱动的作者姓名消歧","authors":"Qiang Zhou, Wei Chen, Weiqing Wang, Jiajie Xu, Lei Zhao","doi":"10.1109/ICWS53863.2021.00071","DOIUrl":null,"url":null,"abstract":"Author Name Disambiguation (AND) has received more attention recently, accompanied by the increase of academic publications. To tackle the AND problem, existing studies have proposed many approaches based on different types of information, such as raw document feature (e.g., co-author, title, and keywords), fusion feature (e.g., a hybrid publication embedding based on raw document feature), local structural information (e.g., a publication's neighborhood information on a graph), and global structural information (e.g., the interactive information between a node and others on a graph). However, there has been no work taking all the above-mentioned information into account for the AND problem so far. To fill the gap, we propose a novel framework namely MFAND (Multiple Features Driven Author Name Disambiguation). Specifically, we first employ the raw document and fusion feature to construct six similarity graphs for each author name to be disambiguated. Next, the global and local structural information extracted from these graphs is fed into a novel encoder called R3JG, which integrates and reconstructs the above-mentioned four types of information associated with an author, with the goal of learning the latent information to enhance the generalization ability of the MFAND. Then, the integrated and reconstructed information is fed into a binary classification model for disambiguation. Note that, several pruning strategies are applied before the information extraction to remove noise effectively. Finally, our proposed framework is investigated on two real-world datasets, and the experimental results show that MFAND performs better than all state-of-the-art methods.","PeriodicalId":213320,"journal":{"name":"2021 IEEE International Conference on Web Services (ICWS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Multiple Features Driven Author Name Disambiguation\",\"authors\":\"Qiang Zhou, Wei Chen, Weiqing Wang, Jiajie Xu, Lei Zhao\",\"doi\":\"10.1109/ICWS53863.2021.00071\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Author Name Disambiguation (AND) has received more attention recently, accompanied by the increase of academic publications. To tackle the AND problem, existing studies have proposed many approaches based on different types of information, such as raw document feature (e.g., co-author, title, and keywords), fusion feature (e.g., a hybrid publication embedding based on raw document feature), local structural information (e.g., a publication's neighborhood information on a graph), and global structural information (e.g., the interactive information between a node and others on a graph). However, there has been no work taking all the above-mentioned information into account for the AND problem so far. To fill the gap, we propose a novel framework namely MFAND (Multiple Features Driven Author Name Disambiguation). Specifically, we first employ the raw document and fusion feature to construct six similarity graphs for each author name to be disambiguated. Next, the global and local structural information extracted from these graphs is fed into a novel encoder called R3JG, which integrates and reconstructs the above-mentioned four types of information associated with an author, with the goal of learning the latent information to enhance the generalization ability of the MFAND. Then, the integrated and reconstructed information is fed into a binary classification model for disambiguation. Note that, several pruning strategies are applied before the information extraction to remove noise effectively. Finally, our proposed framework is investigated on two real-world datasets, and the experimental results show that MFAND performs better than all state-of-the-art methods.\",\"PeriodicalId\":213320,\"journal\":{\"name\":\"2021 IEEE International Conference on Web Services (ICWS)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 IEEE International Conference on Web Services (ICWS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICWS53863.2021.00071\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE International Conference on Web Services (ICWS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICWS53863.2021.00071","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Multiple Features Driven Author Name Disambiguation
Author Name Disambiguation (AND) has received more attention recently, accompanied by the increase of academic publications. To tackle the AND problem, existing studies have proposed many approaches based on different types of information, such as raw document feature (e.g., co-author, title, and keywords), fusion feature (e.g., a hybrid publication embedding based on raw document feature), local structural information (e.g., a publication's neighborhood information on a graph), and global structural information (e.g., the interactive information between a node and others on a graph). However, there has been no work taking all the above-mentioned information into account for the AND problem so far. To fill the gap, we propose a novel framework namely MFAND (Multiple Features Driven Author Name Disambiguation). Specifically, we first employ the raw document and fusion feature to construct six similarity graphs for each author name to be disambiguated. Next, the global and local structural information extracted from these graphs is fed into a novel encoder called R3JG, which integrates and reconstructs the above-mentioned four types of information associated with an author, with the goal of learning the latent information to enhance the generalization ability of the MFAND. Then, the integrated and reconstructed information is fed into a binary classification model for disambiguation. Note that, several pruning strategies are applied before the information extraction to remove noise effectively. Finally, our proposed framework is investigated on two real-world datasets, and the experimental results show that MFAND performs better than all state-of-the-art methods.