Fighting reviewer fatigue or amplifying bias? Considerations and recommendations for use of ChatGPT and other large language models in scholarly peer review.

Research integrity and peer review · Impact Factor 7.2 · Q1 (Ethics)
Mohammad Hosseini, Serge P J M Horbach
DOI: 10.1186/s41073-023-00133-5
Published: 2023-05-18 (Journal Article)
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10191680/pdf/
Citations: 43

Abstract

Background: The emergence of systems based on large language models (LLMs), such as OpenAI's ChatGPT, has created a range of discussions in scholarly circles. Since LLMs generate grammatically correct and mostly relevant (yet sometimes outright wrong, irrelevant or biased) outputs in response to provided prompts, using them in various writing tasks, including writing peer review reports, could result in improved productivity. Given the significance of peer review in the existing scholarly publication landscape, exploring the challenges and opportunities of using LLMs in peer review seems urgent. After the generation of the first scholarly outputs with LLMs, we anticipate that peer review reports, too, will be generated with the help of these systems. However, there are currently no guidelines on how these systems should be used in review tasks.

Methods: To investigate the potential impact of using LLMs on the peer review process, we used five core themes within discussions about peer review suggested by Tennant and Ross-Hellauer. These include 1) reviewers' role, 2) editors' role, 3) functions and quality of peer reviews, 4) reproducibility, and 5) the social and epistemic functions of peer reviews. We provide a small-scale exploration of ChatGPT's performance regarding identified issues.

Results: LLMs have the potential to substantially alter the role of both peer reviewers and editors. By supporting both actors in efficiently writing constructive reports or decision letters, LLMs can facilitate higher-quality review and address issues of reviewer shortage. However, the fundamental opacity of LLMs' training data, inner workings, data handling, and development processes raises concerns about potential biases, confidentiality and the reproducibility of review reports. Additionally, as editorial work has a prominent function in defining and shaping epistemic communities, as well as negotiating normative frameworks within such communities, partly outsourcing this work to LLMs might have unforeseen consequences for social and epistemic relations within academia. Regarding performance, we identified major enhancements in a short period and expect LLMs to continue developing.

Conclusions: We believe that LLMs are likely to have a profound impact on academia and scholarly communication. While potentially beneficial to the scholarly communication system, many uncertainties remain and their use is not without risks. In particular, concerns about the amplification of existing biases and inequalities in access to appropriate infrastructure warrant further attention. For the moment, we recommend that if LLMs are used to write scholarly reviews and decision letters, reviewers and editors should disclose their use and accept full responsibility for data security and confidentiality, and their reports' accuracy, tone, reasoning and originality.
