Bayesian and frequentist statistical models to predict publishing output and article processing charge totals

IF 2.8 · Zone 2 (Management) · Q2 Computer Science, Information Systems
Philip M. Dixon, Eric Schares
Journal of the Association for Information Science and Technology, vol. 76, no. 6, pp. 917-932. Published 2025-03-17. DOI: 10.1002/asi.24981
https://onlinelibrary.wiley.com/doi/10.1002/asi.24981
Citations: 0

Abstract

Academic libraries, institutions, and publishers want to predict future publishing output to help evaluate publishing agreements. Current predictive models are overly simplistic and provide inaccurate predictions. This paper presents Bayesian and frequentist statistical models to predict future article counts and costs. These models use past years' counts of corresponding-authored, peer-reviewed articles to predict the distribution of the number of articles in a future year. Article counts for each journal and year are modeled as a log-linear function of year with journal-specific coefficients. Journal-specific predictions are summed to predict the distribution of the total article count and combined with journal-specific costs to predict the distribution of total cost. We fit models to three datasets: 366 Wiley journals from 2016 to 2020, 376 Springer-Nature journals from 2017 to 2021, and 313 Wiley journals from 2017 to 2021. For each dataset, we compared predictions for the subsequent year to actual counts. For two datasets, the model predicts better than either the annual mean count or a linear trend regression. For the third, no method predicts output well. The Bayesian model provides prediction uncertainties that account for all modeled sources of uncertainty. Better estimates of future publishing activity and costs provide critical, independent information for open publishing negotiations.
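
The abstract's pipeline (a per-journal log-linear model of count against year, journal predictions summed to a total, and totals combined with journal-specific costs) can be illustrated with a minimal frequentist sketch. This is not the authors' implementation. It assumes a Poisson count distribution, which the abstract does not specify, and it uses plug-in predictive draws that ignore parameter uncertainty, unlike the Bayesian model described above. The journal names, counts, and APC values are hypothetical, and the fit assumes each journal has at least one nonzero count.

```python
import numpy as np

def fit_loglinear_poisson(years, counts, n_iter=50):
    """Fit log(mu) = a + b * (year - mean_year) by Newton's method.

    Assumption: counts are Poisson (the abstract does not name the
    distribution). Returns the coefficients [a, b] and the centering year.
    """
    x = np.asarray(years, dtype=float)
    y = np.asarray(counts, dtype=float)
    x_mean = x.mean()
    X = np.column_stack([np.ones_like(x), x - x_mean])
    beta = np.array([np.log(y.mean() + 0.5), 0.0])  # flat-trend start
    for _ in range(n_iter):
        mu = np.exp(np.clip(X @ beta, -30.0, 30.0))
        grad = X.T @ (y - mu)            # score of the Poisson log-likelihood
        hess = X.T @ (X * mu[:, None])   # Fisher information
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < 1e-10:
            break
    return beta, x_mean

def predict_totals(history, apc, target_year, n_draws=10000, seed=0):
    """Simulate total article count and total cost for target_year.

    Plug-in simulation: draws use the fitted mean only, so the spread
    understates the uncertainty a full Bayesian model would report.
    """
    rng = np.random.default_rng(seed)
    total_count = np.zeros(n_draws)
    total_cost = np.zeros(n_draws)
    for journal, (years, counts) in history.items():
        beta, x_mean = fit_loglinear_poisson(years, counts)
        mu = np.exp(beta[0] + beta[1] * (target_year - x_mean))
        draws = rng.poisson(mu, size=n_draws)
        total_count += draws                  # sum journal-level predictions
        total_cost += draws * apc[journal]    # weight by journal-specific cost
    return total_count, total_cost

# Hypothetical data: two journals, five years of counts, per-article APCs.
history = {
    "Journal A": ([2016, 2017, 2018, 2019, 2020], [10, 12, 15, 17, 21]),
    "Journal B": ([2016, 2017, 2018, 2019, 2020], [5, 4, 6, 3, 5]),
}
apc = {"Journal A": 3000.0, "Journal B": 2500.0}

counts, costs = predict_totals(history, apc, target_year=2021)
print("Total count, 2.5/50/97.5 percentiles:", np.percentile(counts, [2.5, 50, 97.5]))
print("Total cost, 2.5/50/97.5 percentiles:", np.percentile(costs, [2.5, 50, 97.5]))
```

Summing simulated draws, rather than summing point predictions, is what yields a distribution for the total count and total cost instead of a single number.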


Source journal
CiteScore: 8.30
Self-citation rate: 8.60%
Articles published: 115
Journal description: The Journal of the Association for Information Science and Technology (JASIST) is a leading international forum for peer-reviewed research in information science. For more than half a century, JASIST has provided intellectual leadership by publishing original research that focuses on the production, discovery, recording, storage, representation, retrieval, presentation, manipulation, dissemination, use, and evaluation of information and on the tools and techniques associated with these processes. The Journal welcomes rigorous work of an empirical, experimental, ethnographic, conceptual, historical, socio-technical, policy-analytic, or critical-theoretical nature. JASIST also commissions in-depth review articles (“Advances in Information Science”) and reviews of print and other media.