{"title":"Measuring Agreement about Ranked Decision Choices for a Single Subject","authors":"R. Riffenburgh, P. Johnstone","doi":"10.2202/1557-4679.1113","DOIUrl":null,"url":null,"abstract":"Introduction. When faced with a medical classification, clinicians often rank-order the likelihood of potential diagnoses, treatment choices, or prognoses as a way to focus on likely occurrences without dropping rarer ones from consideration. To know how well clinicians agree on such rankings might help extend the realm of clinical judgment farther into the purview of evidence-based medicine. If rankings by different clinicians agree better than chance, the order of assignments and their relative likelihoods may justifiably contribute to medical decisions. If the agreement is no better than chance, the ranking should not influence the medical decision.Background. Available rank-order methods measure agreement over a set of decision choices by two rankers or by a set of rankers over two choices (rank correlation methods), or an overall agreement over a set of choices by a set of rankers (Kendall's W), but will not measure agreement about a single decision choice across a set of rankers. Rating methods (e.g. kappa) assign multiple subjects to nominal categories rather than ranking possible choices about a single subject and will not measure agreement about a single decision choice across a set of rankers.Method. In this article, we pose an agreement coefficient A for measuring agreement among a set of clinicians about a single decision choice and compare several potential forms of A. A takes on the value 0 when agreement is random and 1 when agreement is perfect. It is shown that A = 1 - observed disagreement/maximum disagreement. A particular form of A is recommended and tables of 5% and 10% significant values of A are generated for common numbers of ranks and rankers.Examples. In the selection of potential treatment assignments by a Tumor Board to a patient with a neck mass, there is no significant agreement about any treatment. Another example involves ranking decisions about a proposed medical research protocol by an Institutional Review Board (IRB). The decision to pass a protocol with minor revisions shows agreement at the 5% significance level, adequate for a consistent decision.","PeriodicalId":50333,"journal":{"name":"International Journal of Biostatistics","volume":"47 47 1","pages":""},"PeriodicalIF":1.2000,"publicationDate":"2009-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.2202/1557-4679.1113","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Biostatistics","FirstCategoryId":"100","ListUrlMain":"https://doi.org/10.2202/1557-4679.1113","RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 2
Abstract
Introduction. When faced with a medical classification, clinicians often rank-order the likelihood of potential diagnoses, treatment choices, or prognoses as a way to focus on likely possibilities without dropping rarer ones from consideration. Knowing how well clinicians agree on such rankings might help extend the realm of clinical judgment farther into the purview of evidence-based medicine. If rankings by different clinicians agree better than chance, the order of assignments and their relative likelihoods may justifiably contribute to medical decisions; if the agreement is no better than chance, the ranking should not influence the medical decision.

Background. Available rank-order methods measure agreement over a set of decision choices by two rankers, or by a set of rankers over two choices (rank correlation methods), or overall agreement over a set of choices by a set of rankers (Kendall's W), but they will not measure agreement about a single decision choice across a set of rankers. Rating methods (e.g., kappa) assign multiple subjects to nominal categories rather than ranking possible choices about a single subject, and likewise will not measure agreement about a single decision choice across a set of rankers.

Method. In this article, we propose an agreement coefficient A for measuring agreement among a set of clinicians about a single decision choice and compare several potential forms of A. A takes the value 0 when agreement is random and 1 when agreement is perfect. It is shown that A = 1 - observed disagreement / maximum disagreement. A particular form of A is recommended, and tables of 5% and 10% significance values of A are generated for common numbers of ranks and rankers.

Examples. In the selection of potential treatment assignments by a Tumor Board for a patient with a neck mass, there is no significant agreement about any treatment. Another example involves ranking decisions about a proposed medical research protocol by an Institutional Review Board (IRB); the decision to pass the protocol with minor revisions shows agreement at the 5% significance level, adequate for a consistent decision.
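The abstract gives only the general form A = 1 - observed disagreement / maximum disagreement, so the Python sketch below is an illustration under stated assumptions, not the paper's recommended coefficient. It takes disagreement to be the mean absolute pairwise difference among the ranks that the rankers assigned to the single choice under study, and normalizes by the disagreement expected when each ranker's rank is drawn uniformly from 1..k, so that A is 1 at perfect agreement and near 0 under random ranking, as the abstract describes. The function name agreement_coefficient and both measures are hypothetical choices for illustration; the paper compares several forms of A and recommends one that may differ.

```python
from itertools import combinations

def agreement_coefficient(ranks, k):
    """Agreement among rankers about one decision choice.

    ranks -- the rank each ranker assigned to the single choice
             under study (1 = most preferred), e.g. [1, 2, 1, 1, 1]
    k     -- number of ranks available to each ranker

    Returns A = 1 - observed / chance disagreement: 1 at perfect
    agreement, about 0 under random ranking (and below 0 when
    rankers disagree more than chance alone would produce).
    """
    pairs = list(combinations(ranks, 2))
    # Observed disagreement: mean absolute rank difference over all
    # pairs of rankers (an illustrative choice, not necessarily the
    # form the paper recommends).
    observed = sum(abs(a - b) for a, b in pairs) / len(pairs)
    # Chance disagreement: E|X - Y| for two independent ranks drawn
    # uniformly from 1..k, which equals (k^2 - 1) / (3k).
    chance = (k * k - 1) / (3 * k)
    return 1 - observed / chance

# Five IRB members each rank "pass with minor revisions" among
# k = 5 possible decisions about a protocol:
print(agreement_coefficient([1, 1, 2, 1, 1], k=5))  # 0.75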
Journal introduction:
The International Journal of Biostatistics (IJB) seeks to publish new biostatistical models and methods, new statistical theory, and original applications of statistical methods to important practical problems arising from the biological, medical, public health, and agricultural sciences, with an emphasis on semiparametric methods. Given the many publication alternatives within biostatistics, IJB offers a venue for research focusing on modern methods, often based on machine learning and other data-adaptive methodologies, while providing a unique reading experience that compels the author to be explicit about the statistical inference problem addressed by the paper. The journal is intended to cover the entire range of biostatistics, from theoretical advances to relevant and sensible translations of practical problems into a statistical framework. Electronic publication also allows data and software code to be appended, opening the door to reproducible research by allowing readers to easily replicate the analyses described in a paper. Both original research and review articles will be warmly received, as will articles applying sound statistical methods to practical problems.