Predictable by publication: discovery of early highly cited academic papers based on their own features
Authors: Xiaobo Tang, Heshen Zhou, Shixuan Li
Journal: Library Hi Tech (Information Science & Library Science; impact factor 3.4)
Publication date: 2023-02-06
DOI: 10.1108/lht-06-2022-0305 (https://doi.org/10.1108/lht-06-2022-0305)
Citations: 1
Abstract
Purpose
Predicting highly cited papers can enable an evaluation of the potential of papers and the early detection and determination of academic achievement value. However, most highly cited paper prediction studies consider early citation information, so predicting highly cited papers by publication is challenging. Therefore, the authors propose a method for predicting early highly cited papers based on their own features.

Design/methodology/approach
This research analyzed academic papers published in the Journal of the Association for Computing Machinery (ACM) from 2000 to 2013. Five types of features were extracted: paper features, journal features, author features, reference features and semantic features. Subsequently, the authors applied a deep neural network (DNN), support vector machine (SVM), decision tree (DT) and logistic regression (LGR), and they predicted highly cited papers 1–3 years after publication.

Findings
Experimental results showed that early highly cited academic papers are predictable when they are first published. The authors’ prediction models showed considerable performance. This study further confirmed that the features of references and authors play an important role in predicting early highly cited papers. In addition, the proportion of high-quality journal references has a more significant impact on prediction.

Originality/value
Based on the available information at the time of publication, this study proposed an effective early highly cited paper prediction model. This study facilitates the early discovery and realization of the value of scientific and technological achievements.
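As a rough illustration of two ingredients named in the abstract, the sketch below computes a "proportion of high-quality journal references" feature and assigns a binary "highly cited" label within each publication-year cohort. It is not the authors' implementation. The `Paper` record, the `high_quality_journals` set, the `citations_after_3y` field and the `top_fraction=0.1` cutoff are all hypothetical; the abstract does not say how high-quality journals were identified or what threshold defines a highly cited paper.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Paper:
    """Hypothetical record; field names are illustrative, not from the study."""
    paper_id: str
    year: int
    reference_journals: list[str]  # venue of each cited reference
    citations_after_3y: int        # used only to build the label, never as a feature


def high_quality_reference_ratio(paper: Paper, high_quality_journals: set[str]) -> float:
    """Share of a paper's references whose venue is in an assumed curated set."""
    if not paper.reference_journals:
        return 0.0
    hits = sum(1 for j in paper.reference_journals if j in high_quality_journals)
    return hits / len(paper.reference_journals)


def label_highly_cited(papers: list[Paper], top_fraction: float = 0.1) -> dict[str, bool]:
    """Label the top `top_fraction` of each publication-year cohort as highly cited.

    The cutoff is an assumption; at least one paper per cohort is always labeled.
    """
    by_year: dict[int, list[Paper]] = {}
    for p in papers:
        by_year.setdefault(p.year, []).append(p)

    labels: dict[str, bool] = {}
    for cohort in by_year.values():
        ranked = sorted(cohort, key=lambda p: p.citations_after_3y, reverse=True)
        k = max(1, round(len(ranked) * top_fraction))
        for rank, p in enumerate(ranked):
            labels[p.paper_id] = rank < k
    return labels
```

The feature uses only information available at publication time, while the citation count enters only through the label. That separation is what "predictable by publication" requires. The resulting feature vectors and labels could then be passed to any of the classifier families the abstract lists.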
About the journal:
■ Integrated library systems
■ Networking
■ Strategic planning
■ Policy implementation across entire institutions
■ Security
■ Automation systems
■ The role of consortia
■ Resource access initiatives
■ Architecture and technology
■ Electronic publishing
■ Library technology in specific countries
■ User perspectives on technology
■ How technology can help disabled library users
■ Library-related web sites