{"title":"Computing Inter-Rater Reliability for Observational Data: An Overview and Tutorial.","authors":"Kevin A Hallgren","doi":"10.20982/tqmp.08.1.p023","DOIUrl":null,"url":null,"abstract":"<p><p>Many research designs require the assessment of inter-rater reliability (IRR) to demonstrate consistency among observational ratings provided by multiple coders. However, many studies use incorrect statistical procedures, fail to fully report the information necessary to interpret their results, or do not address how IRR affects the power of their subsequent analyses for hypothesis testing. This paper provides an overview of methodological issues related to the assessment of IRR with a focus on study design, selection of appropriate statistics, and the computation, interpretation, and reporting of some commonly-used IRR statistics. Computational examples include SPSS and R syntax for computing Cohen's kappa and intra-class correlations to assess IRR.</p>","PeriodicalId":45805,"journal":{"name":"Quantitative Methods for Psychology","volume":null,"pages":null},"PeriodicalIF":1.3000,"publicationDate":"2012-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3402032/pdf/nihms372951.pdf","citationCount":"3199","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Quantitative Methods for Psychology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.20982/tqmp.08.1.p023","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"SOCIAL SCIENCES, INTERDISCIPLINARY","Score":null,"Total":0}
引用次数: 3199
Abstract
Many research designs require the assessment of inter-rater reliability (IRR) to demonstrate consistency among observational ratings provided by multiple coders. However, many studies use incorrect statistical procedures, fail to fully report the information necessary to interpret their results, or do not address how IRR affects the power of their subsequent analyses for hypothesis testing. This paper provides an overview of methodological issues related to the assessment of IRR with a focus on study design, selection of appropriate statistics, and the computation, interpretation, and reporting of some commonly-used IRR statistics. Computational examples include SPSS and R syntax for computing Cohen's kappa and intra-class correlations to assess IRR.