Predictable by publication: discovery of early highly cited academic papers based on their own features

IF 3.4 | Zone 3 (Management) | INFORMATION SCIENCE & LIBRARY SCIENCE

Library Hi Tech | Pub Date: 2023-02-06 | DOI: 10.1108/lht-06-2022-0305

Xiaobo Tang, Heshen Zhou, Shixuan Li


Citations: 1

Abstract

Purpose: Predicting highly cited papers can enable an evaluation of the potential of papers and the early detection and determination of academic achievement value. However, most highly cited paper prediction studies consider early citation information, so predicting highly cited papers by publication is challenging. Therefore, the authors propose a method for predicting early highly cited papers based on their own features.

Design/methodology/approach: This research analyzed academic papers published in the Journal of the Association for Computing Machinery (ACM) from 2000 to 2013. Five types of features were extracted: paper features, journal features, author features, reference features and semantic features. Subsequently, the authors applied a deep neural network (DNN), support vector machine (SVM), decision tree (DT) and logistic regression (LGR), and they predicted highly cited papers 1–3 years after publication.

Findings: Experimental results showed that early highly cited academic papers are predictable when they are first published. The authors’ prediction models showed considerable performance. This study further confirmed that the features of references and authors play an important role in predicting early highly cited papers. In addition, the proportion of high-quality journal references has a more significant impact on prediction.

Originality/value: Based on the available information at the time of publication, this study proposed an effective early highly cited paper prediction model. This study facilitates the early discovery and realization of the value of scientific and technological achievements.
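To make the setup in the abstract concrete, the following is a minimal sketch of publication-time prediction, not the authors' implementation. It uses only features available when a paper appears (no citation counts), labels papers as highly cited from citations observed later, and fits a logistic regression, one of the four model families named above. Everything specific here is an assumption for illustration: the `Paper` fields, the feature scaling, the top-fraction labelling rule (the abstract does not say how "highly cited" is defined), the hand-written gradient-descent trainer, and the synthetic toy records.

```python
import math
from dataclasses import dataclass


@dataclass
class Paper:
    # All field names are hypothetical; the abstract does not list its exact features.
    n_authors: int
    max_author_h_index: int
    n_references: int
    n_top_journal_refs: int  # references to journals treated as "high quality"
    citations_3y: int        # observed after publication; used only for the label


def features(p: Paper) -> list[float]:
    """Publication-time features only, roughly scaled to [0, 1]."""
    share = p.n_top_journal_refs / p.n_references if p.n_references else 0.0
    return [p.n_authors / 10, p.max_author_h_index / 50, share]


def label_highly_cited(papers: list[Paper], top_fraction: float = 0.4) -> list[int]:
    """Label the top `top_fraction` of papers by 3-year citations as 1 (assumed rule)."""
    ranked = sorted((p.citations_3y for p in papers), reverse=True)
    k = max(1, round(len(papers) * top_fraction))
    threshold = ranked[k - 1]
    return [1 if p.citations_3y >= threshold else 0 for p in papers]


def predict_proba(w: list[float], b: float, x: list[float]) -> float:
    """Logistic model: probability that a paper is highly cited."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr: float = 0.5, epochs: int = 2000):
    """Full-batch gradient descent on the logistic loss, starting from zeros."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Synthetic toy records, invented purely for illustration.
PAPERS = [
    Paper(2, 10, 20, 2, 1), Paper(3, 15, 30, 6, 3), Paper(1, 5, 10, 1, 0),
    Paper(4, 40, 40, 30, 25), Paper(5, 35, 50, 40, 40), Paper(2, 8, 25, 5, 2),
    Paper(3, 30, 20, 14, 18), Paper(2, 12, 40, 4, 4), Paper(6, 45, 60, 54, 60),
    Paper(1, 6, 15, 3, 1),
]

if __name__ == "__main__":
    X = [features(p) for p in PAPERS]
    y = label_highly_cited(PAPERS)
    w, b = train_logistic(X, y)
    new_paper = Paper(3, 30, 30, 24, 0)  # citations unknown at publication
    print(f"P(highly cited) = {predict_proba(w, b, features(new_paper)):.2f}")
```

The citation count enters only through the label, which mirrors the constraint stated in the abstract that prediction relies on information available at the time of publication. The share of references to high-quality journals is included because the findings single it out; the other two features are placeholders for the author feature group.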




Source journal

Library Hi Tech (INFORMATION SCIENCE & LIBRARY SCIENCE)

CiteScore: 8.30

Self-citation rate: 44.10%

Articles published:

Journal scope:
■ Integrated library systems
■ Networking
■ Strategic planning
■ Policy implementation across entire institutions
■ Security
■ Automation systems
■ The role of consortia
■ Resource access initiatives
■ Architecture and technology
■ Electronic publishing
■ Library technology in specific countries
■ User perspectives on technology
■ How technology can help disabled library users
■ Library-related web sites