Discussion on the validity of commonly used reliability indices in sports medicine and exercise science: a critical review with data simulations.

Impact factor: 2.8 · CAS Zone 3 (Medicine) · JCR Q2 (Physiology)
European Journal of Applied Physiology. Pub Date: 2025-06-01. Epub Date: 2025-02-13. DOI: 10.1007/s00421-025-05720-6
Konstantin Warneke, Thomas Gronwald, Sebastian Wallot, Alessia Magno, Martin Hillebrecht, Klaus Wirth
Pages: 1511-1526. Publication type: Journal Article. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12174282/pdf/
Citation count: 0

Abstract

Alongside objectivity and validity, reliability is considered a precondition for testing in scientific work, because unreliable testing protocols limit the conclusions that can be drawn, especially for practical application. Classification guidelines commonly refer to relative reliability, focusing on Pearson correlation coefficients (r_p) and intraclass correlation coefficients (ICC). The standard error of measurement (SEM) and the minimal detectable change (MDC) are often derived from these, and the coefficient of variation (CV) is reported in addition. These indices, however, do not account for systematic or random errors (e.g., standardization problems). To illustrate, we applied reliability statistics common in sports science to simulated data that extended the sample size of two original counter-movement-jump sessions from (youth) elite basketball players. The results show that excellent r_p and ICC (≥ 0.9) without a systematic bias were accompanied by a mean absolute percentage error of over 20%. Furthermore, we showed that the ICC does not account for systematic errors and has only limited value as an indicator of accuracy, which can lead to misleading conclusions from the data. While a simple re-organization of the data improved relative reliability and meaningfully narrowed the limits of agreement, systematic errors occurred. This example underlines the lack of validity and objectivity of commonly used ICC-based reliability statistics (SEM, MDC) for quantifying the primary and secondary variance sources. After revealing several caveats in the literature (e.g., neglect of systematic and random error, or failure to distinguish between protocol and device reliability), we suggest a methodological approach to ensure reliable data collection as a precondition for valid conclusions, for example by recommending pre-set acceptable measurement errors.
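
The point that relative reliability indices can look excellent while a systematic error goes unnoticed can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' simulation: the jump heights are hypothetical values invented for the example, session 2 is session 1 plus a fixed 5 cm offset and small noise, the SEM uses the pooled SD of both sessions, and the ICC is computed in its consistency form, ICC(3,1), with the absolute-agreement form, ICC(2,1), shown only for comparison.

```python
import math
from statistics import mean, stdev, variance

def pearson_r(x, y):
    """Pearson correlation coefficient between two sessions."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def icc_two_sessions(x, y):
    """Return (ICC(3,1) consistency, ICC(2,1) absolute agreement)
    from a two-way ANOVA with n subjects and k = 2 sessions."""
    n, k = len(x), 2
    grand = mean(x + y)
    row_means = [(a + b) / 2 for a, b in zip(x, y)]   # per subject
    col_means = [mean(x), mean(y)]                    # per session
    ss_rows = k * sum((r - grand) ** 2 for r in row_means)
    ss_cols = n * sum((c - grand) ** 2 for c in col_means)
    ss_total = sum((v - grand) ** 2 for v in x + y)
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    consistency = (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)
    agreement = (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )
    return consistency, agreement

# Hypothetical jump heights in cm (NOT data from the paper).
session1 = [30.0, 32.0, 35.0, 37.0, 40.0, 43.0, 45.0, 48.0]
noise = [0.5, -0.5, -0.5, 0.5, 0.5, -0.5, 0.5, -0.5]
session2 = [a + 5.0 + e for a, e in zip(session1, noise)]  # +5 cm systematic shift

r_p = pearson_r(session1, session2)
icc_consistency, icc_agreement = icc_two_sessions(session1, session2)

# ICC-based absolute indices (assumption: pooled SD of both sessions).
pooled_sd = math.sqrt((variance(session1) + variance(session2)) / 2)
sem = pooled_sd * math.sqrt(1 - icc_consistency)
mdc95 = 1.96 * math.sqrt(2) * sem

# Error-oriented statistics that do see the shift.
diffs = [b - a for a, b in zip(session1, session2)]
bias = mean(diffs)                                   # systematic error
loa = (bias - 1.96 * stdev(diffs), bias + 1.96 * stdev(diffs))
mape = 100 * mean(abs(d) / a for d, a in zip(diffs, session1))

print(f"r_p = {r_p:.3f}, ICC(3,1) = {icc_consistency:.3f}, ICC(2,1) = {icc_agreement:.3f}")
print(f"SEM = {sem:.2f} cm, MDC95 = {mdc95:.2f} cm")
print(f"bias = {bias:.2f} cm, LoA = ({loa[0]:.2f}, {loa[1]:.2f}) cm, MAPE = {mape:.1f}%")
```

In this toy data set the Pearson coefficient and the consistency ICC are both above 0.99, and the MDC derived from them is far smaller than the 5 cm offset between sessions, which only the bias, the limits of agreement and the MAPE reveal. How strongly an ICC reacts to such a shift depends on the form chosen, so the form should be reported alongside the value.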

Source journal
CiteScore: 6.00
Self-citation rate: 6.70%
Articles published: 227
Review time: 3 months
Journal description: The European Journal of Applied Physiology (EJAP) aims to promote mechanistic advances in human integrative and translational physiology. Physiology is viewed broadly, having overlapping context with related disciplines such as biomechanics, biochemistry, endocrinology, ergonomics, immunology, motor control, and nutrition. EJAP welcomes studies dealing with physical exercise, training and performance. Studies addressing physiological mechanisms are preferred over descriptive studies. Papers dealing with animal models or pathophysiological conditions are not excluded from consideration, but must be clearly relevant to human physiology.