R package to estimate intracluster correlation coefficient for nominal and ordinal data
Hrishikesh Chakraborty, Nicole Solomon
Computer methods and programs in biomedicine update, volume 8, Article 100200. Published 2025-01-01. DOI: 10.1016/j.cmpbup.2025.100200
https://www.sciencedirect.com/science/article/pii/S2666990025000242
Citations: 0
Abstract
Background and Objective
The intracluster correlation coefficient (ICC) measures the degree of similarity, or correlation, between observations within the same cluster or group. It is commonly used in cluster-randomized trials to estimate the average within-cluster correlation. Methods to estimate the ICC exist for binary, continuous, and survival data, and a new resampling-based approach has been developed for nominal or ordinal responses with more than two categories. The objective of this paper is to present both the resampling-based estimator and the method of moments (MoM) estimator of the ICC for categorical data. To facilitate adoption of these estimators, we developed an R package, iccmult, which calculates the ICC point estimate and confidence interval (CI) for categorical response data under each of the two methods.
Methods
In this paper we incorporated the resampling-based estimation method and the MoM estimator originally developed to characterize population genetic structure. A simulation study compared estimates from the MoM and resampling methods under different event rates, numbers of clusters, and cluster sizes. The iccmult package provides two estimates of the ICC and its CI, one from each method. The package also generates clustered categorical response data.
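As a rough illustration of what a moment-based estimator of a categorical ICC can look like, the Python sketch below implements a generic ANOVA-type estimator: each category is treated as a 0/1 indicator, between-cluster and within-cluster mean squares are computed per category, and the components are pooled across categories before the ratio is taken. This is an assumption made for illustration only. It is not taken from iccmult, whose MoM and resampling estimators may differ in detail, and the function name `anova_icc_categorical` is hypothetical.

```python
from collections import Counter, defaultdict


def anova_icc_categorical(cluster_ids, responses):
    """Illustrative ANOVA-type moment estimate of the ICC for a categorical response.

    Hypothetical helper, not the iccmult implementation.
    """
    cluster_ids = list(cluster_ids)
    responses = list(responses)

    # Group the responses by cluster.
    clusters = defaultdict(list)
    for cid, y in zip(cluster_ids, responses):
        clusters[cid].append(y)

    sizes = [len(members) for members in clusters.values()]
    k = len(sizes)
    n_total = sum(sizes)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two clusters and more observations than clusters")

    # Adjusted average cluster size used by ANOVA estimators with unequal clusters.
    n0 = (n_total - sum(n * n for n in sizes) / n_total) / (k - 1)

    numerator = 0.0
    denominator = 0.0
    for category, count in Counter(responses).items():
        p = count / n_total  # overall proportion in this category
        msb = 0.0  # between-cluster sum of squares for the indicator
        msw = 0.0  # within-cluster sum of squares for the indicator
        for members in clusters.values():
            n_i = len(members)
            p_i = members.count(category) / n_i
            msb += n_i * (p_i - p) ** 2
            msw += n_i * p_i * (1.0 - p_i)
        msb /= k - 1
        msw /= n_total - k
        # Pool variance components across categories before forming the ratio.
        numerator += msb - msw
        denominator += msb + (n0 - 1.0) * msw

    if denominator == 0.0:
        raise ValueError("ICC is undefined when only one category is observed")
    return numerator / denominator


# Four clusters of three subjects; every cluster is internally homogeneous.
ids = ["a"] * 3 + ["b"] * 3 + ["c"] * 3 + ["d"] * 3
homogeneous = [0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0]
print(anova_icc_categorical(ids, homogeneous))
```

When every cluster is internally homogeneous the within-cluster component vanishes and the estimate is 1. When every cluster has the same category mix the between-cluster component vanishes and the estimate is negative, a known property of ANOVA-type estimators.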
Results
The iccmult package provides two functions for users. The function rccat() generates clustered categorical data, while the function iccmulti() estimates the ICC and its CI. The simulation study showed that the resampling and MoM methods perform nearly identically in estimating the population ICC. However, the MoM method demonstrated greater precision in scenarios with fewer clusters and smaller cluster sizes.
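To make the idea of clustered categorical data concrete, the Python sketch below simulates it with a simple mixture mechanism: each cluster draws one shared category, and each member either copies it with probability sqrt(rho) or draws its own category independently. Under this mechanism the correlation between two members' category indicators is rho. This mechanism is an assumption chosen for illustration and is not claimed to be what rccat() does. The function name `simulate_clustered_categorical` and its parameters are hypothetical.

```python
import math
import random


def simulate_clustered_categorical(rho, probs, n_clusters, cluster_size, seed=None):
    """Simulate (cluster_id, category) pairs with target ICC rho.

    Hypothetical helper, not the iccmult rccat() function.
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1] for this mechanism")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("category probabilities must sum to 1")

    rng = random.Random(seed)
    categories = list(range(len(probs)))
    # Two members agree through the shared draw with probability copy_prob ** 2 = rho.
    copy_prob = math.sqrt(rho)

    rows = []
    for cid in range(n_clusters):
        shared = rng.choices(categories, weights=probs)[0]
        for _ in range(cluster_size):
            if rng.random() < copy_prob:
                y = shared  # member inherits the cluster-level category
            else:
                y = rng.choices(categories, weights=probs)[0]  # independent draw
            rows.append((cid, y))
    return rows


# Example: 20 clusters of 10 subjects, three categories, target ICC of 0.1.
data = simulate_clustered_categorical(0.1, [0.2, 0.3, 0.5], 20, 10, seed=42)
print(len(data))
```

Setting rho to 1 makes every member copy the cluster-level draw, so each cluster is homogeneous. Setting it to 0 gives independent draws from the marginal category probabilities.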
Conclusions
The R package iccmult offers easy-to-use ways to generate clustered categorical data and to estimate the ICC and its CI for a nominal or ordinal response using different methods. The package is freely available for use with R from the CRAN repository (https://cran.r-project.org/package=iccmult). We believe that this package can be a useful tool for researchers designing cluster-randomized trials with a categorical outcome.