Simulation-based design optimization for statistical power: Utilizing machine learning.
Felix Zimmer, Rudolf Debelak
Psychological Methods, pp. 513-536. Published 2025-06-01 (Epub 2023-12-14). Journal Article.
DOI: 10.1037/met0000611 (https://doi.org/10.1037/met0000611)
Abstract
The planning of adequately powered research designs increasingly goes beyond determining a suitable sample size. More challenging scenarios demand simultaneous tuning of multiple design parameter dimensions and can only be addressed using Monte Carlo simulation if no analytical approach is available. In addition, cost considerations, for example, in terms of monetary costs, are a relevant target for optimization. In this context, optimal design parameters can imply a desired level of power at minimum cost or maximum power at a cost threshold. We introduce a surrogate modeling framework based on machine learning predictions to solve these optimization tasks. In a simulation study, we demonstrate the efficiency for a wide range of hypothesis testing scenarios with single- and multidimensional design parameters, including t tests, analysis of variance, item response theory models, multilevel models, and multiple imputations. Our framework provides an algorithmic solution for optimizing study designs when no analytic power analysis is available, handling multiple design dimensions and cost considerations. Our implementation is publicly available in the R package mlpwr. (PsycInfo Database Record (c) 2025 APA, all rights reserved).
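The surrogate idea summarized in the abstract can be illustrated with a deliberately simplified sketch. It does not use the mlpwr API or reproduce the authors' algorithm: the one-sample z test, the effect size of 0.5, the grid of sample sizes, and the probit-linear least-squares fit (standing in for a machine learning surrogate) are all assumptions chosen for brevity. The sketch estimates power by Monte Carlo simulation at a few sample sizes, fits a cheap surrogate to those noisy estimates, and then reads off the smallest sample size whose predicted power reaches the target.

```python
import math
import random
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()


def estimate_power(n, runs, rng, effect=0.5, alpha=0.05):
    """Monte Carlo power of a two-sided one-sample z test (known sd = 1).

    The test and the effect size are illustrative assumptions.
    """
    crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(runs):
        sample_mean = fmean(rng.gauss(effect, 1.0) for _ in range(n))
        if abs(sample_mean * math.sqrt(n)) > crit:
            rejections += 1
    return rejections / runs


def fit_probit_surrogate(ns, powers, runs):
    """Least-squares fit of probit(power) = a + b * sqrt(n).

    A hypothetical stand-in for the surrogate model; returns (a, b).
    """
    lo, hi = 0.5 / runs, 1 - 0.5 / runs  # keep the probit finite at 0 or 1
    xs = [math.sqrt(n) for n in ns]
    ys = [STD_NORMAL.inv_cdf(min(max(p, lo), hi)) for p in powers]
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def smallest_n_for_power(a, b, target, max_n=10_000):
    """Smallest n whose surrogate-predicted power reaches the target."""
    for n in range(2, max_n + 1):
        if STD_NORMAL.cdf(a + b * math.sqrt(n)) >= target:
            return n
    return None


if __name__ == "__main__":
    rng = random.Random(1)
    runs = 400
    grid = [10, 20, 30, 40, 50, 60]  # sample sizes to simulate (assumed)
    powers = [estimate_power(n, runs, rng) for n in grid]
    a, b = fit_probit_surrogate(grid, powers, runs)
    print("simulated power:", dict(zip(grid, powers)))
    print("suggested n for 80% power:", smallest_n_for_power(a, b, 0.80))
```

The surrogate pools information across all simulated sample sizes, so the recommendation does not depend on any single noisy power estimate. With several design dimensions or a cost function, the same loop would search over the surrogate's predictions rather than over a single sample size.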
About the journal:
Psychological Methods is devoted to the development and dissemination of methods for collecting, analyzing, understanding, and interpreting psychological data. Its purpose is the dissemination of innovations in research design, measurement, methodology, and quantitative and qualitative analysis to the psychological community; its further purpose is to promote effective communication about related substantive and methodological issues. The audience is expected to be diverse and to include those who develop new procedures, those who are responsible for undergraduate and graduate training in design, measurement, and statistics, as well as those who employ those procedures in research.