I. Predicting Future Citation Counts Using Machine Learning
Khalid Mansour, Essam Al-Daoud, Baha Al-Karaky
2021 22nd International Arab Conference on Information Technology (ACIT), published 2021-12-21
https://doi.org/10.1109/acit53391.2021.9677228
The number of citations a research paper receives is an important criterion in the scientific research community and a quality indicator for both the author and the affiliated institution. As the number of researchers, universities, and scientific institutes grows, the number of papers submitted to scientific journals has increased substantially. Consequently, the time needed to process submitted papers has increased as well, which delays the publication of high-quality papers. This paper proposes an approach that uses machine learning algorithms to predict the future citations of submitted research papers, so that submissions predicted to be of high quality can be processed faster. A dataset solicited from the International Arab Journal of Information Technology (IAJIT), covering the last ten years, is used in our experiments. Sixteen machine learning algorithms are trained on the dataset, and two sets of results are produced. The first concerns the relative importance of the features used in the prediction process. The second reports the performance of the sixteen machine learning algorithms in predicting the future citations of submitted papers. The experimental results show that the number of references is the most important feature, while the number of authors listed on a paper is the least important. In addition, the neural network and voting classifier 1 techniques perform slightly better than the other techniques in predicting future citations, with Naïve Bayes coming next. The remaining machine learning methods show similar performance.
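To make the idea of a voting classifier concrete, the following is a minimal sketch of hard (majority) voting over several base classifiers. It is not the authors' implementation: the abstract does not describe how "voting classifier 1" is composed, so the three rule-based classifiers, the feature layout (number of references, number of authors, number of pages), the thresholds, and the "high"/"low" labels are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import Callable, Sequence, Tuple

# Hypothetical feature vector: (num_references, num_authors, num_pages).
Paper = Tuple[int, int, int]
Classifier = Callable[[Paper], str]


# Three toy base classifiers with made-up thresholds (assumptions, not
# values from the paper). Each predicts a "high" or "low" citation class.
def by_references(paper: Paper) -> str:
    return "high" if paper[0] >= 30 else "low"


def by_authors(paper: Paper) -> str:
    return "high" if paper[1] >= 3 else "low"


def by_pages(paper: Paper) -> str:
    return "high" if paper[2] >= 10 else "low"


BASE_CLASSIFIERS: Sequence[Classifier] = (by_references, by_authors, by_pages)


def hard_vote(classifiers: Sequence[Classifier], paper: Paper) -> str:
    """Return the label predicted by the majority of the base classifiers."""
    votes = Counter(clf(paper) for clf in classifiers)
    # With an odd number of voters and two labels, a tie cannot occur.
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # 40 references, 2 authors, 12 pages: two of three rules vote "high".
    print(hard_vote(BASE_CLASSIFIERS, (40, 2, 12)))
    # 10 references, 5 authors, 4 pages: two of three rules vote "low".
    print(hard_vote(BASE_CLASSIFIERS, (10, 5, 4)))
```

In practice the base classifiers would be trained models rather than fixed rules. Using an odd number of voters with two classes avoids the need for a tie-breaking policy.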