R package to estimate intracluster correlation coefficient for nominal and ordinal data

Hrishikesh Chakraborty, Nicole Solomon
Computer Methods and Programs in Biomedicine Update, Volume 8, Article 100200 (published 2025-01-01). DOI: 10.1016/j.cmpbup.2025.100200. https://www.sciencedirect.com/science/article/pii/S2666990025000242

Abstract

Background and Objective

The intracluster correlation coefficient (ICC) is a critical parameter for assessing the degree of similarity or correlation between observations within the same cluster or group. It is commonly used in cluster-randomized trials to estimate the average within-cluster correlation. Although methods to estimate the ICC exist for binary, continuous, and survival data, a new resampling-based approach has been developed for nominal or ordinal responses with more than two categories. The objective of this paper is to present both the resampling-based estimator and the method of moments (MoM) estimator for categorical ICC estimation. To facilitate the adoption and use of these estimators, we developed an R package, iccmult, which calculates the ICC point estimate and confidence interval (CI) for categorical response data under each of the two methods.

Methods

In this paper we incorporated the resampling-based estimation method and the MoM estimator originally developed to characterize population genetic structure. A simulation study was conducted to compare estimates from the MoM with those from the resampling method under different event rates, numbers of clusters, and cluster sizes. The iccmult package provides two estimates of the ICC and its CI, one computed under each method. The package also generates clustered categorical response data.

Results

The iccmult package provides two functions for users: rccat() generates clustered categorical data, and iccmulti() estimates the ICC and its CI. The simulation study showed that the resampling and MoM methods perform nearly identically in estimating the population ICC. However, the MoM method demonstrated greater precision in scenarios with fewer clusters and smaller cluster sizes.

Conclusions

The R package iccmult offers easy-to-use functions to generate clustered categorical data and to estimate the ICC and its CI for a nominal or ordinal response under either of the two methods. The package is freely available for use with R from the CRAN repository (https://cran.r-project.org/package=iccmult). We believe that this package can be a useful tool for researchers designing cluster-randomized trials with a categorical outcome.
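
To make the idea of a categorical ICC concrete, the following is a minimal Python sketch. It is an illustration under stated assumptions and does not reproduce the resampling or MoM estimators of the paper or the iccmult API. The function names simulate_clusters and kappa_icc are hypothetical. The generator assumes a simple mixture model in which each cluster member copies a cluster-level latent category with probability sqrt(rho) and otherwise draws independently. The estimator is a swapped-in kappa-type pairwise-agreement statistic, (P_o - P_e) / (1 - P_e), where P_o is the share of within-cluster pairs that agree and P_e is the agreement expected by chance.

```python
import math
import random
from collections import Counter


def simulate_clusters(rho, probs, n_clusters, cluster_size, seed=0):
    """Hypothetical generator (not rccat()): mixture model for clustered categories.

    Each member copies the cluster's latent category with probability
    sqrt(rho); otherwise it draws its own category from `probs`.
    Under this assumed model the pairwise-agreement ICC equals rho.
    """
    rng = random.Random(seed)
    categories = list(range(len(probs)))
    copy_prob = math.sqrt(rho)
    data = []
    for _ in range(n_clusters):
        latent = rng.choices(categories, weights=probs)[0]
        members = [
            latent if rng.random() < copy_prob
            else rng.choices(categories, weights=probs)[0]
            for _ in range(cluster_size)
        ]
        data.append(members)
    return data


def kappa_icc(clusters):
    """Kappa-type ICC estimate (an assumption, not the paper's estimator)."""
    concordant = 0      # ordered within-cluster pairs sharing a category
    pairs = 0           # all ordered within-cluster pairs
    totals = Counter()  # overall category counts
    for members in clusters:
        counts = Counter(members)
        totals.update(counts)
        n = len(members)
        pairs += n * (n - 1)
        concordant += sum(c * (c - 1) for c in counts.values())
    if pairs == 0:
        raise ValueError("need at least one cluster with two or more members")
    n_total = sum(totals.values())
    p_obs = concordant / pairs
    p_exp = sum((c / n_total) ** 2 for c in totals.values())
    if p_exp == 1:
        raise ValueError("only one category observed; ICC is undefined")
    return (p_obs - p_exp) / (1 - p_exp)


# Usage: three categories, target ICC of 0.3 under the assumed model.
clusters = simulate_clusters(rho=0.3, probs=[0.5, 0.3, 0.2],
                             n_clusters=400, cluster_size=20, seed=1)
print(round(kappa_icc(clusters), 3))
```

With many clusters the printed estimate should fall near the target value. With few or small clusters it becomes noticeably more variable, which is the setting in which the abstract reports a precision advantage for the MoM estimator.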