I. Predicting Future Citation Counts Using Machine Learning

2021 22nd International Arab Conference on Information Technology (ACIT) Pub Date : 2021-12-21 DOI:10.1109/acit53391.2021.9677228

Khalid Mansour, Essam Al-Daoud, Baha Al-Karaky


Citations: 0

Abstract


I. Predicting Future Citation Counts Using Machine Learning

The number of citations a research paper receives is an important criterion in the scientific research community and an important quality indicator for both the author and the affiliated institution. As the number of researchers, universities and scientific institutes grows, the number of papers submitted to scientific journals has increased substantially. Consequently, the time needed to process submissions has increased as well, which delays the publication of high-quality papers. This paper proposes an approach that uses machine learning algorithms to predict the future citations of submitted research papers, so that submissions predicted to be of high quality can be processed faster. A dataset solicited from the International Arab Journal of Information Technology (IAJIT), covering the last ten years, is used in our experiments. Sixteen machine learning algorithms are trained on the dataset, and two sets of results are produced. The first concerns the relative importance of the features used in the prediction process. The second shows how well the sixteen algorithms predict the future citations of submitted papers. The experimental results show that the number of references is the most important feature, while the number of authors listed on a paper is the least important. In addition, the neural network and voting classifier 1 techniques perform slightly better than the other techniques in predicting future citations, with Naïve Bayes next. The remaining machine learning methods show similar performance.
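The two ideas named in the abstract, a voting classifier and a ranking of feature importance, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data is synthetic (not the IAJIT dataset), the labelling rule and the threshold-based base classifiers are hypothetical, hard majority voting is only one way to build a voting classifier, and permutation importance is swapped in as one common importance measure because the abstract does not say which method the authors used.

```python
import random
from collections import Counter

# Hypothetical column order; only these two features are named in the abstract.
REFS, AUTHORS = 0, 1


def make_toy_papers(n, seed=0):
    """Synthetic rows [num_references, num_authors]; not the IAJIT data."""
    rng = random.Random(seed)
    rows = [[rng.randint(5, 60), rng.randint(1, 6)] for _ in range(n)]
    # Toy labelling rule (an assumption): 1 = "highly cited" when refs > 30.
    labels = [1 if row[REFS] > 30 else 0 for row in rows]
    return rows, labels


def threshold_rule(feature, cutoff):
    """Base classifier: predict 1 when one feature exceeds a cutoff."""
    return lambda row: 1 if row[feature] > cutoff else 0


def hard_vote(classifiers):
    """Combine base classifiers by majority vote on their predicted labels."""
    def predict(row):
        votes = Counter(clf(row) for clf in classifiers)
        return votes.most_common(1)[0][0]
    return predict


def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(predict, rows, labels, feature, seed=1):
    """Accuracy lost when one feature column is shuffled across rows."""
    column = [row[feature] for row in rows]
    random.Random(seed).shuffle(column)
    shuffled = [row[:feature] + [value] + row[feature + 1:]
                for row, value in zip(rows, column)]
    return accuracy(predict, rows, labels) - accuracy(predict, shuffled, labels)


rows, labels = make_toy_papers(300)
# Three base rules (an odd count avoids tied votes on binary labels).
ensemble = hard_vote([
    threshold_rule(REFS, 25),
    threshold_rule(REFS, 35),
    threshold_rule(AUTHORS, 3),
])
baseline = accuracy(ensemble, rows, labels)
refs_importance = permutation_importance(ensemble, rows, labels, REFS)
authors_importance = permutation_importance(ensemble, rows, labels, AUTHORS)
print(f"accuracy={baseline:.3f}")
print(f"importance: references={refs_importance:.3f}, authors={authors_importance:.3f}")
```

In this toy setup the reference count matters more than the author count purely by construction, because the synthetic label depends only on references. The sketch shows how such a ranking can be measured and does not reproduce the paper's result.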
