Residual permutation tests for feature importance in machine learning
Po-Hsien Huang
British Journal of Mathematical & Statistical Psychology (impact factor 1.8; JCR Q3, Mathematics, Interdisciplinary Applications)
Published: 2025-08-30
DOI: https://doi.org/10.1111/bmsp.70009
Citations: 0
Abstract
Psychological research has traditionally relied on linear models to test scientific hypotheses. However, the emergence of machine learning (ML) algorithms has opened new opportunities for exploring variable relationships beyond linear constraints. To interpret the outcomes of these 'black-box' algorithms, various tools for assessing feature importance have been developed. However, most of these tools are descriptive and do not facilitate statistical inference. To address this gap, our study introduces two versions of residual permutation tests (RPTs), designed to assess the significance of a target feature in predicting the label. The first variant, RPT on Y (RPT-Y), permutes the residuals of the label conditioned on features other than the target. The second variant, RPT on X (RPT-X), permutes the residuals of the target feature conditioned on the other features. Through a comprehensive simulation study, we show that RPT-X maintains empirical Type I error rates under the nominal level across a wide range of ML algorithms and demonstrates appropriate statistical power in both regression and classification contexts. These findings suggest the utility of RPT-X for hypothesis testing in ML applications.
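The RPT-X idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the abstract does not specify the conditional model, the test statistic, or the p-value formula, so the least-squares stand-in learner (`ols_fit_predict`), the held-out mean squared error statistic, the half/half split, and the function name `rpt_x` are all assumptions made here for illustration.

```python
import numpy as np

def ols_fit_predict(X_train, y_train, X_new):
    """Stand-in learner (assumption): least squares with an intercept."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_new)), X_new]) @ coef

def rpt_x(X, y, j, fit_predict=ols_fit_predict, n_perm=200, seed=0):
    """Sketch of a residual permutation test on feature j (RPT-X style)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(y)

    # Step 1: model the target feature from the other features, keep residuals.
    others = np.delete(X, j, axis=1)
    fitted = fit_predict(others, X[:, j], others)
    resid = X[:, j] - fitted

    # Assumed statistic: held-out MSE with a fixed half/half split.
    train = np.arange(n) < n // 2
    test = ~train

    def heldout_mse(xj):
        Xs = X.copy()
        Xs[:, j] = xj
        pred = fit_predict(Xs[train], y[train], Xs[test])
        return np.mean((y[test] - pred) ** 2)

    observed = heldout_mse(X[:, j])

    # Step 2: rebuild the target feature from permuted residuals and refit.
    null = np.array([
        heldout_mse(fitted + rng.permutation(resid)) for _ in range(n_perm)
    ])

    # Step 3: one-sided p-value; a small loss means the feature helps.
    p_value = (1 + np.sum(null <= observed)) / (n_perm + 1)
    return observed, p_value

# Toy data: only the first feature affects the label.
demo_rng = np.random.default_rng(1)
X_demo = demo_rng.normal(size=(400, 3))
y_demo = 2.0 * X_demo[:, 0] + demo_rng.normal(size=400)
_, p_signal = rpt_x(X_demo, y_demo, j=0)
_, p_noise = rpt_x(X_demo, y_demo, j=2)
print(p_signal, p_noise)
```

Permuting residuals rather than the raw feature keeps the rebuilt feature's dependence on the other features intact, which is the distinction the abstract draws for RPT-X. An RPT-Y analogue would instead permute the residuals of the label given the non-target features; any learner with the same `fit_predict` signature could replace the least-squares stand-in.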
About the journal:
The British Journal of Mathematical and Statistical Psychology publishes articles relating to areas of psychology whose arguments have a greater mathematical or statistical component than other journals usually accept, including:
• mathematical psychology
• statistics
• psychometrics
• decision making
• psychophysics
• classification
• relevant areas of mathematics, computing and computer software
These include articles that address substantive psychological issues or that develop and extend techniques useful to psychologists. New models for psychological processes, new approaches to existing data, critiques of existing models, and improved algorithms for estimating the parameters of a model are examples of articles that may be favoured.