Rethinking Peer Review Using the Swiss Cheese Model to Better Flag Problematic Manuscripts

Jennifer A. Byrne, Anna Abalkina, Jana Christopher, Marie F. Soulière

Learned Publishing, 38(4). Published 18 August 2025. DOI: 10.1002/leap.2021
Peer review represents a cornerstone of scientific and scholarly publishing. Despite many unanswered questions about the value of peer review (Tennant and Ross-Hellauer 2020), it is widely assumed that peer review improves the quality of published articles. In turn, reviewers accept peer review requests based on the interest and relevance of manuscripts, and as a service to their fields (Severin and Chataway 2021). It has been estimated that reviewers contributed over 100 million hours to peer review in 2020, with an approximate collective value of over USD 2 billion (Aczel et al. 2021). Millions of peer reviewers participate in peer review every year (Aczel et al. 2021) through invitations from the many thousands of editors who support peer-reviewed journals.

Peer review occurs through a stepwise process (Figure 1): manuscripts are first considered by editors, and a subset progresses to external peer review. These stages involve different individuals with often differing expertise, and hence differing capacities to assess research quality. Peer review has, however, paid less attention to detecting manuscripts with questionable integrity. This could reflect assumptions that relatively few manuscripts present integrity issues, and that peer reviewers may be poorly equipped to detect integrity issues within manuscripts (Stroebe et al. 2012).

The rise in manuscripts from paper mills (Behl 2021; Bricker-Anthony and Giangrande 2022; Cooper and Han 2021; Heck et al. 2021; Pinna et al. 2020; Van Noorden 2023) and the possible misapplication of large language models (LLMs) to generate and scale fast-churn manuscripts (Grimaldi and Ehrler 2023; Suchak et al. 2025) highlight the need to revisit assumptions about the capacity of peer review to detect unethical or low-value submissions. Suspected submissions from paper mills have now been described by many biomedical journals (Behl 2021; Bricker-Anthony and Giangrande 2022; Cooper and Han 2021; Heck et al. 2021; Pinna et al. 2020; Seifert 2021). Other journals have described receiving large numbers of repetitive manuscripts (Jin 2022), suggesting the possible misapplication of LLMs by paper mills or individual teams (Mainous III 2025; Munafò et al. 2024; Stewart 2025). Such changing patterns of submissions (Mainous III 2025) and, in some cases, publications (Stender et al. 2024; Suchak et al. 2025) question assumptions that unethical manuscript submissions are infrequent. As submissions from paper mills and the use of LLMs to produce derivative manuscripts become more frequent, awareness of scaled and repetitive submissions is also expected to grow (Byrne and Stender 2025). This awareness can be leveraged through peer review processes that actively consider research and scholarly integrity in addition to quality (Abalkina et al. 2025).

There are further reasons to focus on peer review to counter submissions from paper mills. In contrast to genuine research, where the rate-limiting step towards publication is likely to be conducting the research itself, the rate-limiting step for paper mills is likely to be peer review (Byrne et al. 2022). It is therefore not surprising that paper mills attempt to manipulate this rate-limiting step (Byrne et al. 2022; Matusz et al. 2025), for example by recruiting editors and peer reviewers to facilitate manuscript acceptance (Abalkina et al. 2025; Joelving and Retraction Watch 2024; Pinna et al. 2020), and by creating fake reviewer identities (Matusz et al. 2025). Given the critical, rate-limiting importance of peer review to paper mills, their manuscripts and submission tactics are likely to continually evolve to evade detection (Byrne et al. 2024), highlighting the need for peer review processes to keep pace (Abalkina et al. 2025).

The scale and changing nature of the paper mill problem highlight the value of rethinking peer review according to the Swiss Cheese Model of error and accident causation and prevention (Larouzee and Le Coze 2020; Wiegmann et al. 2022). The Swiss Cheese Model represents processes as slices of cheese, where strengths are represented by the cheese, and weaknesses or failings by holes in the cheese (Figure 1). The model proposes that errors or accidents can occur when blind spots at different process stages are shared, and that errors and accidents can be prevented when processes have complementary areas of strength (Wiegmann et al. 2022) (Figure 1).

When applied to peer review, the Swiss Cheese Model allows each stage of the peer review process to be recognised as a new opportunity to detect problematic manuscripts (Figure 1). Ideally, most problematic submissions will be detected at the early stages of peer review, with fewer problematic submissions progressing to the later, more resource-intensive stages (Figure 1).
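The layered-detection logic of the Swiss Cheese Model can be made concrete with a small probabilistic sketch. This is our own illustration, not part of the article: the stage names and detection rates below are hypothetical. If each peer review stage independently catches a problematic manuscript with some probability, the chance that such a manuscript slips through every stage is the product of the per-stage miss rates.

```python
# Toy illustration of the Swiss Cheese Model applied to peer review.
# Each screening stage is a 'slice' that detects a problematic manuscript
# with some probability; the 'holes' are the per-stage miss rates.
# All stage names and rates are hypothetical, for illustration only.

def slip_through_probability(detection_rates):
    """Probability that a problematic manuscript evades every stage,
    assuming stages detect independently."""
    p_slip = 1.0
    for p_detect in detection_rates:
        p_slip *= 1.0 - p_detect
    return p_slip

stages = {
    "editorial triage": 0.50,
    "reviewer selection checks": 0.10,
    "external peer review": 0.40,
    "review of revisions": 0.20,
}

p_all = slip_through_probability(stages.values())  # 0.5 * 0.9 * 0.6 * 0.8 ≈ 0.216

# Tightening any single 'hole' shrinks the product, which is why feeding
# detected manuscript features back to editors strengthens the whole stack:
stronger_triage = dict(stages, **{"editorial triage": 0.70})
assert slip_through_probability(stronger_triage.values()) < p_all
```

The independence assumption is a deliberate simplification: correlated blind spots, where several stages share the same 'hole', are precisely the failure mode the model warns against, and they make the true slip-through probability higher than this product suggests.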
Nonetheless, if problematic manuscript features can be communicated back to editors, this can tighten some 'holes' through which problematic manuscripts could otherwise slip. While no peer review system can successfully detect all submissions from bad actors (Cooke et al. 2024; Wykes and Parkinson 2023), continuous, real-time improvements could reduce the number of accepted manuscripts that, with the benefit of hindsight, should have been rejected.

Other authors have recently described the features of paper mill manuscripts that can be detected by editors and peer reviewers, and how the peer review system can better respond (Abalkina et al. 2025; Christopher 2021). We will largely not repeat previous advice but instead focus on opportunities that have not been previously discussed. In all cases, we will describe simple modifications of existing practices that could be easily implemented (Table 1). We will give some final consideration to the possible consequences of implementing these recommended changes at scale, and how some publisher and journal practices might need to adapt as a result.

Editors make final decisions as to whether manuscripts will be accepted or rejected; they therefore play critical roles in peer review (Sever 2023; Tennant and Ross-Hellauer 2020). Many journals have editorial assistants or in-house teams checking submissions and making decisions on initial desk rejections (Horbach and Halffman 2020). These early decisions determine whether and how manuscripts progress to the later stages of peer review, and how much time and other resources will be assigned to individual manuscripts. It is in both editors' and peer reviewers' interests that editorial review identifies low-value and potentially dishonest submissions as effectively as possible, to reduce the resources that could otherwise be wasted through further consideration (Cooke et al. 2024; Sever 2023; Wykes and Parkinson 2023).

To detect problematic submissions at the editorial review stage, publishers and journals are increasingly investing in research integrity teams and manuscript screening tools, and using known features of problematic submissions to flag manuscripts for desk rejection (Abalkina et al. 2025; Alam and Wilson 2023). However, despite the best efforts of journals and publishers, not all problematic submissions will be detected through editorial review (Behl 2021; Bricker-Anthony and Giangrande 2022) (Figure 1). First, screening tools may not be available to all journals, particularly where individual journals need to pay for access. Second, most screening tools are commercial products whose targeted features may not be disclosed, and practical understanding of their output could be limited. This could lead to uncertainty over the value of predictions, and over the decisions that editorial staff should take in response. Third, in-house teams and screening tools are unlikely to recognise all paper mill or mass-produced manuscript features, given our currently limited knowledge of the extent of the paper mill problem, and the predicted capacity of paper mill submissions to evolve in response to improved detection (Byrne et al. 2024). Finally, awareness of problematic submissions is likely to vary between individual editors and journals. For example, some editors might simply be unaware of paper mills, or assume that their journal will not be targeted. Similarly, editors might not recognise features of problematic submissions, particularly if their journal has not been previously (or knowingly) targeted, or receives few submissions.

If a manuscript is not desk-rejected, editors will invite experts to review the manuscript, with the aim of securing 2–3 reviewers as quickly as possible (Sever 2023).
While the process of inviting reviewers might not always be viewed as a separate stage of peer review, many journals are likely to be spending more time and resources issuing peer reviewer invitations, as more invitations are declined (Severin and Chataway 2021; Fox et al. 2017; Sever 2023). Difficulties in identifying peer reviewers could be exacerbated by a growing reluctance to approach author-suggested reviewers, in case these individuals are linked to paper mills (Wittau and Seifert 2024).

Once a reviewer has accepted a peer review request, they will usually supply their review through the journal's editorial platform. This involves providing comments to authors, optional confidential comments to the editor, and answering a range of questions about the manuscript and themselves as reviewers (Dine et al. 2015).

Many revised manuscripts undergo further peer review, particularly if many revisions are requested (Sever 2023). Editorial and peer review of revised manuscripts therefore provide another opportunity to detect and flag integrity concerns that may have been missed earlier (Figure 1). For example, manuscript revision provides an opportunity for editors to request supporting raw data, which can sometimes reveal new integrity concerns (Christopher 2021).

We recognise that taking a more adversarial approach to peer review could have unintended consequences. A proportion of integrity concerns could be unfounded, and so desk-rejecting manuscripts based on integrity concerns could disadvantage some genuine research and scholarship. This is an important consideration, particularly given the higher stakes that peer review can represent for junior scholars (Meibauer et al. 2025).

High rates of desk rejection could also impose additional stress on editors, which could adversely affect decision making (Bazerman and Sezer 2016; Wykes and Parkinson 2023). High desk rejection rates could be particularly challenging for smaller or newer journals, where editors may be encouraged to send more submissions for peer review. Rapid desk rejections could also allow quick resubmission of paper mill manuscripts to other journals. These factors highlight the importance of considered desk rejections, where the timeliness of editorial decisions may be less important (Wykes and Parkinson 2023).

Encouraging confidential comments to editors could also provide opportunities for reviewers to make vexatious claims to disadvantage competitors (Swanson 2023). While this is a possibility, it is important to recognise that rival reviewers have other opportunities to disadvantage competitors through their comments to authors. Some journals allow authors to exclude reviewers from peer review to reduce the possibility of vexatious claims or delays. From our experience as both editors and peer reviewers, single-reviewer descriptions of integrity concerns seem unlikely to determine editorial decisions. However, peer review processes that allow manuscripts to (i) be declined for peer review due to possible integrity concerns and (ii) receive peer review comments on similar issues could be less susceptible to vexatious claims.

Many journal editors have described rising numbers of problematic submissions that are increasingly overwhelming peer review resources and leading to rising numbers of low-value publications. The Swiss Cheese Model allows peer review to be reconsidered from an error detection and accident prevention perspective, where each peer review stage can be considered a new opportunity to detect problematic submissions. The changes that we have suggested could allow journal editors and peer reviewers to more effectively flag problematic submissions, producing a more resilient peer review system and contributing towards a more evidence-based understanding of peer review. More widespread use of best practices, and a willingness to constantly update peer review processes against emerging threats (Abalkina et al. 2025; Netzer 2024), could allow better use of valuable peer review resources and reduce unethical publications from paper mills.

Jennifer A. Byrne: conceptualization, writing – original draft, writing – review and editing. Anna Abalkina: writing – review and editing, writing – original draft. Jana Christopher: writing – original draft, writing – review and editing. Marie F. Soulière: writing – original draft, writing – review and editing.

All opinions expressed in this commentary represent the views of the authors and do not represent the views of any listed employer or institution. M.F.S. is employed by Frontiers Media SA. The other authors declare no conflicts of interest.
Rethinking Peer Review Using the Swiss Cheese Model to Better Flag Problematic Manuscripts
Peer review represents a cornerstone of scientific and scholarly publishing. Despite many unanswered questions about the value of peer review (Tennant and Ross-Hellauer 2020), it is widely assumed that peer review improves the quality of published articles. In turn, reviewers accept peer review requests based on the interest and relevance of manuscripts, and as a service to their fields (Severin and Chataway 2021). It has been estimated that reviewers contributed over 100 million hours towards peer review in 2020, with an approximate collective value of over USD 2 billion (Aczel et al. 2021). Millions of peer reviewers participate in peer review every year (Aczel et al. 2021) through invitations from the many thousands of editors who support peer reviewed journals.
Peer review occurs through a stepwise process (Figure 1), where manuscripts are first considered by editors, with a subset of manuscripts progressing to external peer review. These stages involve different individuals with often differing expertise and hence capacities to assess research quality. Peer review has, however, paid less attention to detecting manuscripts with questionable integrity. This could reflect assumptions that relatively few manuscripts present integrity issues, and that peer reviewers may be poorly equipped to detect integrity issues within manuscripts (Stroebe et al. 2012).
The rise in manuscripts from paper mills (Behl 2021; Bricker-Anthony and Giangrande 2022; Cooper and Han 2021; Heck et al. 2021; Pinna et al. 2020; Van Noorden 2023) and the possible misapplication of large language models (LLMs) to generate and scale fast-churn manuscripts (Grimaldi and Ehrler 2023; Suchak et al. 2025) highlight the need to revisit assumptions about the capacity of peer review to detect unethical or low-value submissions. Suspected submissions from paper mills have now been described by many biomedical journals (Behl 2021; Bricker-Anthony and Giangrande 2022; Cooper and Han 2021; Heck et al. 2021; Pinna et al. 2020; Seifert 2021). Other journals have described receiving large numbers of repetitive manuscripts (Jin 2022), suggesting the possible misapplication of LLMs by paper mills or individual teams (Mainous III 2025; Munafò et al. 2024; Stewart 2025). Such changing patterns of submissions (Mainous III 2025) and, in some cases, publications (Stender et al. 2024; Suchak et al. 2025) call into question the assumption that unethical manuscript submissions are infrequent. As submissions from paper mills and the use of LLMs to produce derivative manuscripts become more frequent, awareness of scaled and repetitive submissions is also expected to grow (Byrne and Stender 2025). This awareness can be leveraged through peer review processes that actively consider research and scholarly integrity in addition to quality (Abalkina et al. 2025).
There are further reasons to focus on peer review to counter submissions from paper mills. In contrast to genuine research, where the rate-limiting step towards publication is likely to be conducting the research, the rate-limiting step for paper mills is likely to be peer review (Byrne et al. 2022). It is, therefore, not surprising that paper mills attempt to manipulate this rate-limiting step (Byrne et al. 2022; Matusz et al. 2025), for example, by recruiting editors and peer reviewers to facilitate manuscript acceptance (Abalkina et al. 2025; Joelving and Retraction Watch 2024; Pinna et al. 2020), and by creating fake reviewer identities (Matusz et al. 2025). Given the critical and rate-limiting importance of peer review to paper mills, their manuscripts and submission tactics are likely to continually evolve to evade detection (Byrne et al. 2024), highlighting the need for peer review processes to keep pace (Abalkina et al. 2025).
The scale and changing nature of the paper mill problem highlight the value of rethinking peer review according to the Swiss Cheese Model of error and accident causation and prevention (Larouzee and Le Coze 2020; Wiegmann et al. 2022). The Swiss Cheese Model represents processes as different slices of cheese, where strengths are represented by the cheese, and weaknesses or failings are represented by holes in the cheese (Figure 1). The Swiss Cheese Model proposes that errors or accidents can occur when blind spots at different process stages are shared, and that errors and accidents can be prevented when processes have complementary areas of strength (Wiegmann et al. 2022) (Figure 1).
When applied to peer review, the Swiss Cheese Model allows each stage of the peer review process to be recognised as a new opportunity to detect problematic manuscripts (Figure 1). Ideally, most problematic submissions will be detected at the early stages of peer review, with fewer problematic submissions progressing to the later, more resource-intensive stages (Figure 1). Nonetheless, if problematic manuscript features can be communicated back to editors, this can close some of the ‘holes’ through which problematic manuscripts could otherwise slip. While no peer review system can successfully detect all submissions from bad actors (Cooke et al. 2024; Wykes and Parkinson 2023), continuous, real-time improvements could reduce the number of accepted manuscripts that, with the benefit of hindsight, should have been rejected.
Other authors have recently described the features of paper mill manuscripts that can be detected by editors and peer reviewers, and how the peer review system can better respond (Abalkina et al. 2025; Christopher 2021). We will largely not repeat previous advice but instead focus on opportunities that have not been previously discussed. In all cases, we will describe simple modifications of existing practices that could be easily implemented (Table 1). We conclude by considering the possible consequences of implementing these recommended changes at scale, and how some publisher and journal practices might need to adapt as a result.
Editors make final decisions as to whether manuscripts will be accepted or rejected; therefore, they play critical roles in peer review (Sever 2023; Tennant and Ross-Hellauer 2020). Many journals have editorial assistants or in-house teams checking submissions and making decisions on initial desk rejections (Horbach and Halffman 2020). These early decisions determine whether and how manuscripts progress to the later stages of peer review, and how much time and other resources will be assigned to individual manuscripts. It is in both editors' and peer reviewers' interests that editorial review identifies low-value and potentially dishonest submissions as effectively as possible, to reduce the resources that could otherwise be wasted through further consideration (Cooke et al. 2024; Sever 2023; Wykes and Parkinson 2023).
To detect problematic submissions at the editorial review stage, publishers and journals are increasingly investing in research integrity teams and manuscript screening tools, and are using known features of problematic submissions to flag manuscripts for desk rejection (Abalkina et al. 2025; Alam and Wilson 2023). However, despite the best efforts of journals and publishers, not all problematic submissions will be detected through editorial review (Behl 2021; Bricker-Anthony and Giangrande 2022) (Figure 1). First, screening tools may not be available to all journals, particularly where individual journals need to pay for access. Second, most screening tools are commercial products whose targeted features may not be disclosed, and practical understanding of their outputs could be limited. This could lead to uncertainties over the value of predictions and the decisions that editorial staff should take in response. Third, in-house teams and screening tools are unlikely to recognise all paper mill or mass-produced manuscript features, given our currently limited knowledge of the extent of the paper mill problem, and the predicted capacity of paper mill submissions to evolve in response to improved detection (Byrne et al. 2024). Finally, awareness of problematic submissions is likely to vary between individual editors and journals. For example, some editors might simply be unaware of paper mills or assume that their journal will not be targeted. Similarly, editors might not recognise features of problematic submissions, particularly if their journal has not been previously (or knowingly) targeted or receives few submissions.
If a manuscript is not desk-rejected, editors will invite experts to review the manuscript, with the aim of securing 2–3 reviewers as quickly as possible (Sever 2023). While the process of inviting reviewers might not always be viewed as a separate stage of peer review, many journals are likely to be spending more time and resources issuing peer reviewer invitations, as more invitations are declined (Chataway and Severin 2021; Fox et al. 2017; Sever 2023). Difficulties in identifying peer reviewers could be exacerbated through a growing reluctance to approach author-suggested reviewers, in case these individuals are linked to paper mills (Wittau and Seifert 2024).
Once a reviewer has accepted a peer review request, they will usually supply their review through the journal's editorial platform. This involves providing comments to authors, optional confidential comments to the editor, and answering a range of questions about the manuscript and themselves as reviewers (Dine et al. 2015).
Many revised manuscripts undergo further peer review, particularly if many revisions are requested (Sever 2023). Editorial and peer review of revised manuscripts therefore provide another opportunity to detect and flag integrity concerns that may have been missed earlier (Figure 1). For example, manuscript revision provides an opportunity for editors to request supporting raw data that can sometimes reveal new integrity concerns (Christopher 2021).
We recognise that taking a more adversarial approach to peer review could have unintended consequences. A proportion of integrity concerns could be unfounded, so desk-rejecting manuscripts based on integrity concerns could disadvantage some genuine research and scholarship. This is an important consideration, particularly given the higher stakes that peer review can represent for junior scholars (Meibauer et al. 2025).
High rates of desk rejections could also impose additional stress on editors that could adversely impact decision making (Bazerman and Sezer 2016; Wykes and Parkinson 2023). High desk rejection rates could be particularly challenging for smaller or new journals, where editors may be encouraged to send more submissions for peer review. Rapid desk rejections could also allow quick resubmissions of paper mill manuscripts to other journals. These factors highlight the importance of considered desk rejections, where the timeliness of editorial decisions may be less important (Wykes and Parkinson 2023).
Encouraging confidential comments to editors could also provide opportunities for reviewers to make vexatious claims to disadvantage competitors (Swanson 2023). While this represents a possibility, it is important to recognise that rival reviewers have other opportunities to disadvantage competitors through their comments to authors. Some journals allow authors to exclude reviewers from peer review to reduce the possibility of vexatious claims or delays. From our experience as both editors and peer reviewers, a single reviewer's description of integrity concerns seems unlikely to determine editorial decisions. However, peer review processes that allow manuscripts to (i) be declined for peer review due to possible integrity concerns and (ii) receive peer review comments on similar issues could be less susceptible to vexatious claims.
Many journal editors have described rising numbers of problematic submissions that are increasingly overwhelming peer review resources and leading to rising numbers of low-value publications. The Swiss Cheese Model allows peer review to be reconsidered from an error detection and accident prevention perspective, where each peer review stage can be considered a new opportunity to detect problematic submissions. The changes that we have suggested could allow journal editors and peer reviewers to more effectively flag problematic submissions, producing a more resilient peer review system and contributing towards a more evidence-based understanding of peer review. More widespread use of best practices and a willingness to continually update peer review processes in response to emerging threats (Abalkina et al. 2025; Netzer 2024) could allow better use of valuable peer review resources and reduce unethical publications from paper mills.
Jennifer A. Byrne: conceptualization, writing – original draft, writing – review and editing. Anna Abalkina: writing – review and editing, writing – original draft. Jana Christopher: writing – original draft, writing – review and editing. Marie F. Soulière: writing – original draft, writing – review and editing.
All opinions expressed in this commentary represent the views of the authors and do not represent the views of any listed employer or institution.
M.F.S. is employed by Frontiers Media SA. The other authors declare no conflicts of interest.