Author name disambiguation using vector space model and hybrid similarity measures
T. Arif, R. Ali, M. Asger
2014 Seventh International Conference on Contemporary Computing (IC3), 2014-08-01
DOI: 10.1109/IC3.2014.6897162 (https://doi.org/10.1109/IC3.2014.6897162)
Citations: 22
Abstract
Differentiating people by name has always been difficult, and the need to group people within a particular domain by their attributes keeps growing. Despite years of research and many proposed techniques, the name ambiguity problem remains largely unsolved, and each existing technique has its own limitations. For author name disambiguation in digital citations, additional attributes that are normally available in publications, such as the e-mail ID and affiliation of the author and co-authors, can greatly aid the disambiguation process. The vector space model has traditionally been used with great success in information retrieval, and here we explore its use for author name disambiguation. In this paper we propose an enhanced vector space model for disambiguating authors and their publications. Experimental results show that the additional attributes present in publications help considerably and allow the name ambiguity problem to be resolved with a high degree of confidence. From our study and experimental results, we conclude that both the mixed citation and the split citation problems can be handled very efficiently. Our approach substantially improves the evaluation metrics, achieving an F1 score of 0.97.
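To make the vector space idea concrete, the following is a minimal sketch of how citation records could be compared by their attributes. It is not the enhanced model or the hybrid similarity measures proposed in the paper, which the abstract does not specify. It assumes plain term-count vectors and cosine similarity; the function names (`to_vector`, `cosine`), the field names, and all sample records are hypothetical.

```python
import math
from collections import Counter


def to_vector(record):
    """Map a citation record to a sparse term-count vector.

    Each attribute value becomes a term prefixed by its field name, so the
    same string in two different fields is not treated as a match.
    (Assumed representation, chosen for illustration only.)
    """
    vec = Counter()
    for field, values in record.items():
        for value in values:
            vec[f"{field}:{value.strip().lower()}"] += 1
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical records that share the ambiguous author name "J. Smith".
rec1 = {
    "coauthor": ["A. Kumar", "B. Lee"],
    "email": ["jsmith@univ-x.example"],
    "affiliation": ["University X"],
}
rec2 = {
    "coauthor": ["A. Kumar"],
    "email": ["jsmith@univ-x.example"],
    "affiliation": ["University X"],
}
rec3 = {
    "coauthor": ["C. Wong"],
    "email": ["j.smith@inst-y.example"],
    "affiliation": ["Institute Y"],
}

sim_same = cosine(to_vector(rec1), to_vector(rec2))  # overlapping attributes
sim_diff = cosine(to_vector(rec1), to_vector(rec3))  # no shared attributes
print(f"rec1 vs rec2: {sim_same:.3f}")
print(f"rec1 vs rec3: {sim_diff:.3f}")
```

In a sketch like this, records whose similarity exceeds some threshold would be grouped under one author. Grouping too aggressively merges distinct authors (the mixed citation problem), while grouping too cautiously scatters one author's papers (the split citation problem). The threshold and any per-field weighting are design choices this example leaves open.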