Minimum sample size calculation for radiomics-based binary outcome prediction models: Theoretical framework and practical example

IF 5.3 | Zone 1, Medicine | JCR Q1, Oncology
Qian Cao, Zhaoyu Jiang, Zhixiang Wang, Leonard Wee, Andre Dekker, Zhen Zhang, Ji Zhu
Radiotherapy and Oncology, volume 212, Article 111134. Journal article, published 2025-09-08. Not open access.
DOI: 10.1016/j.radonc.2025.111134
https://www.sciencedirect.com/science/article/pii/S0167814025046389

Abstract

Background and purpose

Determining the appropriate sample size for developing robust radiomics-based binary outcome prediction models and identifying the maximum number of predictors safely allowable within a fixed dataset size remain critical yet challenging tasks. This study aims to propose and demonstrate a structured method for addressing these issues, enhancing methodological rigor and practicality in radiomics research.

Materials and methods

We introduce a comprehensive sample size calculation framework for binary outcome prediction models in radiomic studies. The proposed approach integrates three key criteria: (1) maintaining a global shrinkage factor (S) ≥ 0.9 to control model overfitting, (2) ensuring a minimal absolute difference between apparent and adjusted performance metrics, and (3) precisely estimating the overall outcome risk. Additionally, we develop an accessible online calculation tool enabling researchers to efficiently determine either the minimum sample size or the maximum number of predictors permissible, based on clearly defined statistical parameters.
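
The three criteria above can be sketched in code. The abstract does not give the formulas, so this is a minimal sketch that assumes the closed-form expressions commonly used for binary-outcome prediction models (Riley et al.), expressed through an anticipated Cox-Snell R². The function names, the default tolerances (`r2_gap=0.05`, `risk_margin=0.05`), and the example inputs are illustrative assumptions, not values taken from the paper or its online tool.

```python
import math


def shrinkage_sample_size(p, r2_cs, shrinkage):
    """Sample size at which the expected global shrinkage factor equals `shrinkage`.

    p      : number of candidate predictor parameters
    r2_cs  : anticipated Cox-Snell R^2 of the model (an assumed input)
    """
    return p / ((shrinkage - 1.0) * math.log(1.0 - r2_cs / shrinkage))


def max_r2_cs(prevalence):
    """Largest Cox-Snell R^2 attainable for a given outcome proportion."""
    ln_l_null_per_obs = (prevalence * math.log(prevalence)
                         + (1.0 - prevalence) * math.log(1.0 - prevalence))
    return 1.0 - math.exp(2.0 * ln_l_null_per_obs)


def minimum_sample_size(p, r2_cs, prevalence, shrinkage=0.9,
                        r2_gap=0.05, risk_margin=0.05):
    """Return (overall minimum n, [n per criterion]) under the assumed formulas."""
    # Criterion 1: global shrinkage factor S >= `shrinkage`.
    n1 = shrinkage_sample_size(p, r2_cs, shrinkage)
    # Criterion 2: apparent vs adjusted Nagelkerke R^2 differ by at most `r2_gap`,
    # rewritten as a shrinkage factor and fed through the same formula.
    s2 = r2_cs / (r2_cs + r2_gap * max_r2_cs(prevalence))
    n2 = shrinkage_sample_size(p, r2_cs, s2)
    # Criterion 3: overall outcome risk estimated to within +/- `risk_margin`
    # (95% confidence, normal approximation).
    n3 = (1.96 / risk_margin) ** 2 * prevalence * (1.0 - prevalence)
    per_criterion = [math.ceil(n) for n in (n1, n2, n3)]
    return max(per_criterion), per_criterion


def maximum_predictors(n, r2_cs, prevalence, shrinkage=0.9, r2_gap=0.05):
    """Largest p that a fixed n supports under criteria 1 and 2.

    Criterion 3 does not depend on p, so it has to be checked against n separately.
    """
    s2 = r2_cs / (r2_cs + r2_gap * max_r2_cs(prevalence))
    limits = [n * (s - 1.0) * math.log(1.0 - r2_cs / s) for s in (shrinkage, s2)]
    return math.floor(min(limits))


if __name__ == "__main__":
    # Illustrative inputs only: 24 predictor parameters, R^2_CS = 0.288,
    # outcome proportion 0.174.
    print(minimum_sample_size(p=24, r2_cs=0.288, prevalence=0.174))
    print(maximum_predictors(n=662, r2_cs=0.288, prevalence=0.174))
```

With these illustrative inputs the three criteria give 623, 662 and 221 participants, so the largest value (662) is the minimum sample size. Run in the other direction, a fixed dataset of 662 supports at most 24 predictor parameters. Taking the maximum over the criteria is the design choice that matters here: the sample must satisfy all three conditions at once.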

Results

The presented method systematically addresses model overfitting by integrating a global shrinkage factor into the calculation, providing more robust estimates than traditional heuristic approaches (“rules of thumb”). Practical examples demonstrate that this structured method effectively balances predictive accuracy and generalizability, while the online tool provides researchers with a user-friendly platform to perform the necessary calculations.

Conclusion

Clear justification of sample size decisions is essential for developing reliable predictive models in radiomics research. By adopting a structured and rigorous calculation method, researchers can effectively minimize overfitting, ensure accurate risk estimation, and substantially enhance the reliability and validity of their predictive models.
Source journal
Radiotherapy and Oncology (category: Medicine, Nuclear Medicine)
CiteScore: 10.30
Self-citation rate: 10.50%
Articles published: 2445
Review time: 45 days
About the journal: Radiotherapy and Oncology publishes papers describing original research as well as review articles. It covers areas of interest relating to radiation oncology. This includes: clinical radiotherapy, combined modality treatment, translational studies, epidemiological outcomes, imaging, dosimetry, and radiation therapy planning, experimental work in radiobiology, chemobiology, hyperthermia and tumour biology, as well as data science in radiation oncology and physics aspects relevant to oncology. Papers on more general aspects of interest to the radiation oncologist including chemotherapy, surgery and immunology are also published.