Measuring Agreement Using Guessing Models and Knowledge Coefficients.

IF 2.9 · CAS Zone 2 (Psychology) · Q1 Mathematics, Interdisciplinary Applications

Psychometrika Pub Date : 2023-09-01 Epub Date: 2023-06-08 DOI:10.1007/s11336-023-09919-4

Jonas Moss

Citations: 0

Abstract

Several measures of agreement, such as the Perreault-Leigh coefficient, the [Formula: see text], and the recent coefficient of van Oest, are based on explicit models of how judges make their ratings. To handle such measures of agreement under a common umbrella, we propose a class of models called guessing models, which contains most models of how judges make their ratings. Every guessing model has an associated measure of agreement, which we call the knowledge coefficient. Under certain assumptions on the guessing models, the knowledge coefficient equals the multi-rater Cohen's kappa, Fleiss' kappa, the Brennan-Prediger coefficient, or other less-established measures of agreement. We provide several sample estimators of the knowledge coefficient, valid under varying assumptions, and their asymptotic distributions. After a sensitivity analysis and a simulation study of confidence intervals, we find that the Brennan-Prediger coefficient typically outperforms the others, with much better coverage under unfavorable circumstances.
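For orientation, two of the established coefficients named in the abstract can be computed from a table of category counts. The following is a minimal sketch using the standard textbook formulas for Fleiss' kappa and the Brennan-Prediger coefficient; it is not the paper's knowledge-coefficient estimators, and the function names and toy data are illustrative assumptions. Both coefficients share the form (p_o - p_e) / (1 - p_e) and differ only in the chance-agreement term p_e.

```python
from typing import Sequence


def observed_agreement(counts: Sequence[Sequence[int]]) -> float:
    """Mean pairwise agreement p_o.

    counts[i][j] is the number of raters who placed item i in category j.
    Every item must be rated by the same number of raters (at least 2).
    """
    n = sum(counts[0])  # raters per item
    if n < 2:
        raise ValueError("need at least two raters per item")
    total = 0.0
    for row in counts:
        if sum(row) != n:
            raise ValueError("all items must have the same number of raters")
        # Fraction of agreeing rater pairs for this item.
        total += sum(c * (c - 1) for c in row) / (n * (n - 1))
    return total / len(counts)


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa: chance agreement from the pooled category proportions."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_cats = len(counts[0])
    props = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_cats)
    ]
    p_e = sum(p * p for p in props)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (observed_agreement(counts) - p_e) / (1.0 - p_e)


def brennan_prediger(counts: Sequence[Sequence[int]]) -> float:
    """Brennan-Prediger coefficient: chance agreement fixed at 1 / (number of categories)."""
    n_cats = len(counts[0])
    if n_cats < 2:
        raise ValueError("need at least two categories")
    p_e = 1.0 / n_cats
    return (observed_agreement(counts) - p_e) / (1.0 - p_e)


# Toy data (hypothetical): 4 items, 3 raters, 2 categories.
example = [[3, 0], [0, 3], [2, 1], [3, 0]]

if __name__ == "__main__":
    print(f"Fleiss' kappa:    {fleiss_kappa(example):.3f}")      # 0.625
    print(f"Brennan-Prediger: {brennan_prediger(example):.3f}")  # 0.667
```

On this toy table the two coefficients differ because the pooled category proportions are unbalanced (2/3 versus 1/3), which raises the chance-agreement term used by Fleiss' kappa above the uniform value of 1/2 used by Brennan-Prediger.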

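Source journal: Psychometrika (Mathematics; Mathematics, Interdisciplinary Applications)

CiteScore: 4.40
Self-citation rate: 10.00%
Articles published: (not listed)
Review time: >12 weeks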

About the journal: The journal Psychometrika is devoted to the advancement of theory and methodology for behavioral data in psychology, education, and the social and behavioral sciences generally. Its coverage is offered in two sections: Theory and Methods (T&M), and Application Reviews and Case Studies (ARCS). T&M articles present original research and reviews on the development of quantitative models, statistical methods, and mathematical techniques for evaluating data from psychology, the social and behavioral sciences, and related fields. Application Reviews can be integrative, drawing together disparate methodologies for applications, or comparative and evaluative, discussing advantages and disadvantages of one or more methodologies in applications. Case Studies highlight methodology that deepens understanding of substantive phenomena through more informative data analysis or more elegant data description.