A new approach to grant review assessments: score, then rank.

IF 10.7 Q1 ETHICS

Research integrity and peer review · Pub Date: 2023-07-24 · DOI: 10.1186/s41073-023-00131-7

Stephen A Gallo, Michael Pearce, Carole J Lee, Elena A Erosheva

Research integrity and peer review, volume 8, issue 1, article 10. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10367367/pdf/

Citations: 0

Abstract


Background: In many grant review settings, proposals are selected for funding on the basis of summary statistics of review ratings. Challenges of this approach (including the presence of ties and unclear ordering of funding preference for proposals) could be mitigated if rankings such as top-k preferences or paired comparisons, which are local evaluations that enforce ordering across proposals, were also collected and incorporated in the analysis of review ratings. However, analyzing ratings and rankings simultaneously has not been done until recently. This paper describes a practical method for integrating rankings and scores and demonstrates its usefulness for making funding decisions in real-world applications.

Methods: We first present the application of our existing joint model for rankings and ratings, the Mallows-Binomial, in obtaining an integrated score for each proposal and generating the induced preference ordering. We then apply this methodology to several theoretical "toy" examples of rating and ranking data, designed to demonstrate specific properties of the model. We then describe an innovative protocol for collecting rankings of the top-six proposals as an add-on to the typical peer review scoring procedures and provide a case study using actual peer review data to exemplify the output and how the model can appropriately resolve judges' evaluations.

Results: For the theoretical examples, we show how the model can provide a preference order to equally rated proposals by incorporating rankings, to proposals using ratings and only partial rankings (and how they differ from a ratings-only approach) and to proposals where judges provide internally inconsistent ratings/rankings and outlier scoring. Finally, we discuss how, using real-world panel data, this method can provide information about funding priority with a level of accuracy in a well-suited format for research funding decisions.

Conclusions: A methodology is provided to collect and employ both rating and ranking data in peer review assessments of proposal submission quality, highlighting several advantages over methods relying on ratings alone. This method leverages information to most accurately distill reviewer opinion into a useful output to make an informed funding decision and is general enough to be applied to settings such as in the NIH panel review process.
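
The tie-breaking behaviour described in the Results can be illustrated with a toy computation. The following is a minimal sketch, not the authors' implementation. It assumes a formulation in which each rating is Binomial(M, p_j) with lower scores better, each judge's full ranking follows a Mallows distribution under Kendall tau distance, and the consensus ordering must agree with the ordering of the p_j. These modelling details, the fixed concentration parameter `THETA`, the grid search, and all data values are assumptions made for illustration; none of them appear in the abstract.

```python
from itertools import permutations, product
from math import comb, exp, log

M = 10        # hypothetical maximum rating (lower = better)
THETA = 1.0   # hypothetical, fixed Mallows concentration parameter

# One dict per judge: hypothetical integer ratings for proposals A, B, C.
ratings = [
    {"A": 2, "B": 2, "C": 6},
    {"A": 3, "B": 3, "C": 5},
    {"A": 2, "B": 2, "C": 6},
    {"A": 3, "B": 3, "C": 7},
]
# One full ranking per judge, best first. Three of four judges prefer A to B.
rankings = [("A", "B", "C"), ("A", "B", "C"), ("A", "B", "C"), ("B", "A", "C")]


def kendall_tau(r1, r2):
    """Count the pairs of items that two full rankings order differently."""
    pos = {item: i for i, item in enumerate(r2)}
    return sum(
        1
        for i in range(len(r1))
        for j in range(i + 1, len(r1))
        if pos[r1[i]] > pos[r1[j]]
    )


def mallows_loglik(rankings, consensus, theta):
    """Mallows log-likelihood; the normalizer is brute-forced (small panels only)."""
    log_z = log(sum(exp(-theta * kendall_tau(perm, consensus))
                    for perm in permutations(consensus)))
    return sum(-theta * kendall_tau(r, consensus) - log_z for r in rankings)


def binomial_loglik(ratings, p, max_score):
    """Binomial log-likelihood of all ratings given per-proposal parameters p."""
    return sum(
        log(comb(max_score, x)) + x * log(p[item]) + (max_score - x) * log(1 - p[item])
        for judge in ratings
        for item, x in judge.items()
    )


def fit(ratings, rankings, max_score, theta, grid):
    """Grid search over consensus orderings and order-consistent p values."""
    items = sorted(ratings[0])
    best = None
    for consensus in permutations(items):
        rank_ll = mallows_loglik(rankings, consensus, theta)
        for values in product(grid, repeat=len(items)):
            # Keep only p values that are non-decreasing along the consensus order.
            if any(a > b for a, b in zip(values, values[1:])):
                continue
            p = dict(zip(consensus, values))
            ll = binomial_loglik(ratings, p, max_score) + rank_ll
            if best is None or ll > best[0]:
                best = (ll, consensus, p)
    return best


grid = [round(0.05 * k, 2) for k in range(1, 20)]  # 0.05, 0.10, ..., 0.95
mean_rating = {item: sum(j[item] for j in ratings) / len(ratings) for item in ratings[0]}
loglik, consensus, p_hat = fit(ratings, rankings, M, THETA, grid)

print("mean ratings:", mean_rating)
print("consensus ordering:", consensus)
print("estimated p:", p_hat)
```

With these toy inputs, A and B have identical mean ratings, so a ratings-only summary cannot separate them. The joint fit places A ahead of B because the rankings favour A, while the estimated p values for A and B remain equal. The brute-force normalizer and grid search are chosen for readability and would not scale to a realistic panel.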


Source journal

Research integrity and peer review

Self-citation rate

0.00%

Number of articles published

Review time

5 weeks