Equivalence testing to judge model fit: A Monte Carlo simulation.
Authors: James L Peugh, Kaylee Litson, David F Feldon
Journal: Psychological methods, pages 888-925 (Impact Factor 7.8, JCR Q1: Psychology, Multidisciplinary)
DOI: 10.1037/met0000591
Published: 2025-08-01 (Epub 2023-08-10); Publication type: Journal Article
Citations: 0
Abstract
Equivalence testing to judge model fit: A Monte Carlo simulation.
Decades of published methodological research have shown the chi-square test of model fit performs inconsistently and unreliably as a determinant of structural equation model (SEM) fit. Likewise, SEM indices of model fit, such as comparative fit index (CFI) and root-mean-square error of approximation (RMSEA) also perform inconsistently and unreliably. Despite rather unreliable ways to statistically assess model fit, researchers commonly rely on these methods for lack of a suitable inferential alternative. Marcoulides and Yuan (2017) have proposed the first inferential test of SEM fit in many years: an equivalence test adaptation of the RMSEA and CFI indices (i.e., RMSEAt and CFIt). However, the ability of this equivalence testing approach to accurately judge acceptable and unacceptable model fit has not been empirically tested. This fully crossed Monte Carlo simulation evaluated the accuracy of equivalence testing combining many of the same independent variable (IV) conditions used in previous fit index simulation studies, including sample size (N = 100-1,000), model specification (correctly specified or misspecified), model type (confirmatory factor analysis [CFA], path analysis, or SEM), number of variables analyzed (low or high), data distribution (normal or skewed), and missing data (none, 10%, or 25%). Results show equivalence testing performs rather inconsistently and unreliably across IV conditions, with acceptable or unacceptable RMSEAt and CFIt model fit index values often being contingent on complex interactions among conditions. Proportional z-tests and logistic regression analyses indicated that equivalence tests of model fit are problematic under multiple conditions, especially those where models are mildly misspecified. Recommendations for researchers are offered, but with the provision that they be used with caution until more research and development is available. (PsycInfo Database Record (c) 2025 APA, all rights reserved).
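The equivalence-test adaptation of RMSEA evaluated above rests on the noncentral chi-square distribution of the ML test statistic. As a rough illustration of that logic only (this is not the authors' code, and the exact quantile conventions here are assumptions), a point-estimate RMSEA and an upper-bound "T-size" RMSEA can be sketched as follows:

```python
# Sketch: conventional RMSEA and a T-size (equivalence-test style) RMSEA.
# Assumption: the T-size value is taken as the upper confidence limit of the
# noncentrality parameter, converted to the RMSEA scale; this mirrors the
# general logic described in the abstract, not any specific implementation.
import math

from scipy.optimize import brentq
from scipy.stats import chi2, ncx2


def rmsea(t_ml: float, df: int, n: int) -> float:
    """Point-estimate RMSEA from an ML chi-square statistic t_ml."""
    return math.sqrt(max(t_ml - df, 0.0) / (df * (n - 1)))


def t_size_rmsea(t_ml: float, df: int, n: int, alpha: float = 0.05) -> float:
    """Upper-bound RMSEA: solve for the noncentrality delta such that
    P(chi2(df, delta) <= t_ml) = alpha, then convert to the RMSEA scale."""
    # If even delta = 0 is consistent with the data, the bound is 0.
    if chi2.cdf(t_ml, df) <= alpha:
        return 0.0
    f = lambda delta: ncx2.cdf(t_ml, df, delta) - alpha
    delta_hat = brentq(f, 1e-10, 10.0 * t_ml + 100.0)
    return math.sqrt(delta_hat / (df * (n - 1)))
```

For example, a chi-square of 50 on 20 degrees of freedom with N = 200 gives a point-estimate RMSEA near 0.087, and the T-size value is necessarily larger, since it is an upper bound on the same quantity; the equivalence test then compares that bound to an adjusted cutoff.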
Journal introduction:
Psychological Methods is devoted to the development and dissemination of methods for collecting, analyzing, understanding, and interpreting psychological data. Its purpose is the dissemination of innovations in research design, measurement, methodology, and quantitative and qualitative analysis to the psychological community; its further purpose is to promote effective communication about related substantive and methodological issues. The audience is expected to be diverse and to include those who develop new procedures, those who are responsible for undergraduate and graduate training in design, measurement, and statistics, as well as those who employ those procedures in research.