Predicting article quality scores with machine learning: The U.K. Research Excellence Framework

IF 4.1 | Q1 | INFORMATION SCIENCE & LIBRARY SCIENCE
M. Thelwall, K. Kousha, Mahshid Abdoli, E. Stuart, Meiko Makita, Paul Wilson, Jonathan M. Levitt, Petr Knoth, M. Cancellieri
{"title":"Predicting article quality scores with machine learning: The U.K. Research Excellence Framework","authors":"M. Thelwall, K. Kousha, Mahshid Abdoli, E. Stuart, Meiko Makita, Paul Wilson, Jonathan M. Levitt, Petr Knoth, M. Cancellieri","doi":"10.1162/qss_a_00258","DOIUrl":null,"url":null,"abstract":"Abstract National research evaluation initiatives and incentive schemes choose between simplistic quantitative indicators and time-consuming peer/expert review, sometimes supported by bibliometrics. Here we assess whether machine learning could provide a third alternative, estimating article quality using more multiple bibliometric and metadata inputs. We investigated this using provisional three-level REF2021 peer review scores for 84,966 articles submitted to the U.K. Research Excellence Framework 2021, matching a Scopus record 2014–18 and with a substantial abstract. We found that accuracy is highest in the medical and physical sciences Units of Assessment (UoAs) and economics, reaching 42% above the baseline (72% overall) in the best case. This is based on 1,000 bibliometric inputs and half of the articles used for training in each UoA. Prediction accuracies above the baseline for the social science, mathematics, engineering, arts, and humanities UoAs were much lower or close to zero. The Random Forest Classifier (standard or ordinal) and Extreme Gradient Boosting Classifier algorithms performed best from the 32 tested. Accuracy was lower if UoAs were merged or replaced by Scopus broad categories. We increased accuracy with an active learning strategy and by selecting articles with higher prediction probabilities, but this substantially reduced the number of scores predicted.","PeriodicalId":34021,"journal":{"name":"Quantitative Science Studies","volume":"4 1","pages":"547-573"},"PeriodicalIF":4.1000,"publicationDate":"2022-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Quantitative Science Studies","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1162/qss_a_00258","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"INFORMATION SCIENCE & LIBRARY SCIENCE","Score":null,"Total":0}
Citations: 4

Abstract

National research evaluation initiatives and incentive schemes choose between simplistic quantitative indicators and time-consuming peer/expert review, sometimes supported by bibliometrics. Here we assess whether machine learning could provide a third alternative, estimating article quality from multiple bibliometric and metadata inputs. We investigated this using provisional three-level REF2021 peer review scores for 84,966 articles submitted to the U.K. Research Excellence Framework 2021, each matching a Scopus record from 2014–2018 and having a substantial abstract. We found that accuracy was highest in the medical and physical sciences Units of Assessment (UoAs) and economics, reaching 42% above the baseline (72% overall) in the best case. This was based on 1,000 bibliometric inputs and half of the articles used for training in each UoA. Prediction accuracies above the baseline for the social science, mathematics, engineering, arts, and humanities UoAs were much lower or close to zero. The Random Forest Classifier (standard or ordinal) and Extreme Gradient Boosting Classifier algorithms performed best of the 32 tested. Accuracy was lower if UoAs were merged or replaced by Scopus broad categories. We increased accuracy with an active learning strategy and by selecting articles with higher prediction probabilities, but this substantially reduced the number of scores predicted.
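As a rough illustration of the pipeline the abstract describes (and not the authors' code), the sketch below trains a Random Forest on a synthetic stand-in for the ~1,000 bibliometric inputs, holds out half of the articles as the paper does per UoA, and then keeps only predictions whose top class probability clears a threshold, reproducing the accuracy-versus-coverage trade-off noted at the end of the abstract. The synthetic data, scikit-learn estimator settings, and the 0.7 threshold are illustrative assumptions.

```python
# Illustrative sketch only: synthetic features stand in for the paper's
# ~1,000 bibliometric/metadata inputs, and provisional three-level REF
# scores (1-3) are simulated. Estimator settings and the 0.7 probability
# threshold are assumptions, not the authors' configuration.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((2000, 1000))            # stand-in bibliometric/metadata inputs
y = rng.integers(1, 4, size=2000)       # provisional three-level quality scores

# Half of the articles in each UoA were used for training (50/50 split here).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0, stratify=y
)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)
print("overall accuracy:", clf.score(X_test, y_test))

# "Selecting articles with higher prediction probabilities": keep only
# predictions whose top class probability clears a threshold, raising
# accuracy at the cost of predicting fewer scores.
proba = clf.predict_proba(X_test)
confident = proba.max(axis=1) >= 0.7    # illustrative threshold
preds = clf.classes_[proba.argmax(axis=1)]
if confident.any():
    subset_acc = (preds[confident] == y_test[confident]).mean()
    print(f"coverage: {confident.mean():.2f}, "
          f"accuracy on confident subset: {subset_acc:.2f}")
else:
    print("no predictions cleared the threshold on this synthetic data")
```

The abstract also reports that an ordinal Random Forest variant and Extreme Gradient Boosting performed comparably; swapping in an XGBoost classifier would follow the same fit/predict-probability pattern.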
Source journal

Quantitative Science Studies (Information Science & Library Science)
CiteScore: 12.10
Self-citation rate: 12.50%
Articles published: 46
Review time: 22 weeks