Early prediction of scholar popularity

Masoumeh Nezhadbiglari, Marcos André Gonçalves, J. Almeida
2016 IEEE/ACM Joint Conference on Digital Libraries (JCDL), published 2016-06-19
DOI: 10.1145/2910896.2910905 (https://doi.org/10.1145/2910896.2910905)
Citations: 17

Abstract

Prediction of scholar popularity has become an important research topic for a number of reasons. In this paper, we tackle the problem of predicting the popularity trend of scholars, concentrating on making predictions both as early and as accurately as possible. To perform the prediction task, we first extract the popularity trends of scholars from a training set. To that end, we apply a time series clustering algorithm called K-Spectral Clustering (K-SC) to identify the popularity trends as cluster centroids. We then predict trends for scholars in a test set by solving a classification problem. Specifically, we first compute a set of measures for each scholar based on the distance between the early points of that scholar's popularity curve and the identified centroids. We then combine those distance measures with a set of academic features (e.g., number of publications, number of venues, etc.) collected during the same monitoring period, and use them as input to a classification method. One aspect that distinguishes our method from other approaches is that the monitoring period, during which we gather information on each scholar's popularity and academic features, is determined on a per-scholar basis as part of our approach. Using total citation count as a measure of scientific popularity, we evaluate our solution on the popularity time series of more than 500,000 Computer Science scholars, gathered from Microsoft Azure Marketplace. The experimental results show that our prediction method outperforms alternative prediction methods. We also show how to apply our method jointly with regression models to improve the prediction of scholar popularity values (e.g., number of citations) at a given future time.
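The feature-construction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `scale_invariant_distance` and `early_features`, the toy centroids, and the example academic features are all hypothetical. The distance assumes a scale-invariant form in the spirit of K-SC and, for brevity, omits any search over time shifts. Comparing the scholar's early points against the same-length prefix of each centroid is likewise an assumption.

```python
import math


def scale_invariant_distance(x, c):
    """Distance between curve x and centroid c after optimally rescaling c.

    Assumed form: min over alpha of ||x - alpha * c|| / ||x||.
    No time-shift search is performed in this sketch.
    """
    norm_x = math.sqrt(sum(v * v for v in x))
    norm_c_sq = sum(v * v for v in c)
    if norm_x == 0.0:
        return 0.0  # convention assumed here for an all-zero curve
    if norm_c_sq == 0.0:
        return 1.0  # nothing to rescale, so the residual equals ||x||
    # Closed-form least-squares scaling factor.
    alpha = sum(a * b for a, b in zip(x, c)) / norm_c_sq
    residual = math.sqrt(sum((a - alpha * b) ** 2 for a, b in zip(x, c)))
    return residual / norm_x


def early_features(curve_prefix, centroids, academic_features):
    """Build one classifier input: centroid distances plus academic features."""
    t = len(curve_prefix)
    distances = [scale_invariant_distance(curve_prefix, c[:t]) for c in centroids]
    return distances + list(academic_features)


# Hypothetical data: two trend centroids (rising, declining), a scholar
# observed for two time steps, and two academic features
# (number of publications, number of venues).
centroids = [[1, 2, 4, 8], [4, 3, 2, 1]]
features = early_features([2, 4], centroids, [5, 3])
print(features)
```

In this toy case the first distance is zero because the observed prefix is an exact rescaling of the rising centroid's prefix. The resulting vector would then be passed to whichever classification method is chosen.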