{"title":"验证报告:Vahey et al.(2015)的关键再分析“临床领域内隐关系评估程序(IRAP)标准效果的荟萃分析”。","authors":"Ian Hussey","doi":"10.1016/j.jbtep.2024.102015","DOIUrl":null,"url":null,"abstract":"<div><div>The meta-analysis reported in Vahey et al. (2015) concluded that the Implicit Relational Assessment Procedure (IRAP) has high clinical criterion validity (meta-analytic <span><math><mrow><mover><mi>r</mi><mo>‾</mo></mover></mrow></math></span> = .45) and therefore “the potential of the IRAP as a tool for clinical assessment” (p. 64). Vahey et al. (2015) also reported power analyses, and the article is frequently cited for sample size determination in IRAP studies, especially their heuristic of <em>N</em> > 37. This article attempts to verify those results. Results were found to have very poor reproducibility at almost every stage of the data extraction and analysis with errors generally biased towards inflating the effect size. The reported meta-analysis results were found to be mathematically implausible and could not be reproduced despite numerous attempts. Multiple internal discrepancies were found in the effect sizes such as between the forest plot and funnel plot, and between the forest plot and the supplementary data. 23 of the 56 (41.1%) individual effect sizes were not actually criterion effects and did not meet the original inclusion criteria. The original results were also undermined by combining effect sizes with different estimands. Reextraction of effect sizes from the original articles revealed 360 additional effect sizes that met inclusion criteria that should have been included in the original analysis. Examples of selection bias in the inclusion of larger effect sizes were observed. A new meta-analysis was calculated to understand the compound impact of these errors (i.e., without endorsing its results as a valid estimate of the IRAP's criterion validity). The effect size was half the size of the original (<span><math><mrow><mover><mi>r</mi><mo>‾</mo></mover></mrow></math></span> = .22), and the power analyses recommended sample sizes nearly 10 times larger than the original (<em>N</em> > 346), which no published original study using the IRAP has met. In aggregate, this seriously undermines the credibility and utility of the original article's conclusions and recommendations. Vahey et al. (2015) appears to need substantial correction at minimum. In particular, researchers should not rely on its results for sample size justification. A list of suggestions for error detection in meta-analyses is provided.</div></div>","PeriodicalId":48198,"journal":{"name":"Journal of Behavior Therapy and Experimental Psychiatry","volume":"87 ","pages":"Article 102015"},"PeriodicalIF":1.7000,"publicationDate":"2025-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Verification report: A critical reanalysis of Vahey et al. (2015) “A meta-analysis of criterion effects for the Implicit Relational Assessment Procedure (IRAP) in the clinical domain”\",\"authors\":\"Ian Hussey\",\"doi\":\"10.1016/j.jbtep.2024.102015\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>The meta-analysis reported in Vahey et al. (2015) concluded that the Implicit Relational Assessment Procedure (IRAP) has high clinical criterion validity (meta-analytic <span><math><mrow><mover><mi>r</mi><mo>‾</mo></mover></mrow></math></span> = .45) and therefore “the potential of the IRAP as a tool for clinical assessment” (p. 64). Vahey et al. 
(2015) also reported power analyses, and the article is frequently cited for sample size determination in IRAP studies, especially their heuristic of <em>N</em> > 37. This article attempts to verify those results. Results were found to have very poor reproducibility at almost every stage of the data extraction and analysis with errors generally biased towards inflating the effect size. The reported meta-analysis results were found to be mathematically implausible and could not be reproduced despite numerous attempts. Multiple internal discrepancies were found in the effect sizes such as between the forest plot and funnel plot, and between the forest plot and the supplementary data. 23 of the 56 (41.1%) individual effect sizes were not actually criterion effects and did not meet the original inclusion criteria. The original results were also undermined by combining effect sizes with different estimands. Reextraction of effect sizes from the original articles revealed 360 additional effect sizes that met inclusion criteria that should have been included in the original analysis. Examples of selection bias in the inclusion of larger effect sizes were observed. A new meta-analysis was calculated to understand the compound impact of these errors (i.e., without endorsing its results as a valid estimate of the IRAP's criterion validity). The effect size was half the size of the original (<span><math><mrow><mover><mi>r</mi><mo>‾</mo></mover></mrow></math></span> = .22), and the power analyses recommended sample sizes nearly 10 times larger than the original (<em>N</em> > 346), which no published original study using the IRAP has met. In aggregate, this seriously undermines the credibility and utility of the original article's conclusions and recommendations. Vahey et al. (2015) appears to need substantial correction at minimum. In particular, researchers should not rely on its results for sample size justification. A list of suggestions for error detection in meta-analyses is provided.</div></div>\",\"PeriodicalId\":48198,\"journal\":{\"name\":\"Journal of Behavior Therapy and Experimental Psychiatry\",\"volume\":\"87 \",\"pages\":\"Article 102015\"},\"PeriodicalIF\":1.7000,\"publicationDate\":\"2025-01-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Behavior Therapy and Experimental Psychiatry\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0005791624000740\",\"RegionNum\":4,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"PSYCHIATRY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Behavior Therapy and Experimental Psychiatry","FirstCategoryId":"3","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0005791624000740","RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"PSYCHIATRY","Score":null,"Total":0}
Verification report: A critical reanalysis of Vahey et al. (2015) “A meta-analysis of criterion effects for the Implicit Relational Assessment Procedure (IRAP) in the clinical domain”
The meta-analysis reported in Vahey et al. (2015) concluded that the Implicit Relational Assessment Procedure (IRAP) has high clinical criterion validity (meta-analytic r̄ = .45) and therefore demonstrates “the potential of the IRAP as a tool for clinical assessment” (p. 64). Vahey et al. (2015) also reported power analyses, and the article is frequently cited for sample size determination in IRAP studies, especially its heuristic of N > 37. This article attempts to verify those results. Results were found to have very poor reproducibility at almost every stage of data extraction and analysis, with errors generally biased towards inflating the effect size. The reported meta-analysis results were found to be mathematically implausible and could not be reproduced despite numerous attempts. Multiple internal discrepancies were found among the effect sizes, such as between the forest plot and the funnel plot, and between the forest plot and the supplementary data. Of the 56 individual effect sizes, 23 (41.1%) were not actually criterion effects and did not meet the original inclusion criteria. The original results were also undermined by the combination of effect sizes with different estimands. Re-extraction of effect sizes from the original articles revealed 360 additional effect sizes that met the inclusion criteria and should have been included in the original analysis. Examples of selection bias in the inclusion of larger effect sizes were observed. A new meta-analysis was computed to understand the compound impact of these errors (i.e., without endorsing its results as a valid estimate of the IRAP's criterion validity). The resulting effect size was half that of the original (r̄ = .22), and the power analyses recommended sample sizes nearly 10 times larger than the original's (N > 346), a threshold that no published original study using the IRAP has met. In aggregate, this seriously undermines the credibility and utility of the original article's conclusions and recommendations. Vahey et al. (2015) appears to need, at minimum, substantial correction. In particular, researchers should not rely on its results for sample size justification. A list of suggestions for error detection in meta-analyses is provided.
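For context on the scale of the two sample-size recommendations, the sketch below (not taken from either article) approximates the N required to detect a given Pearson correlation using the Fisher z approximation. The exact power level and effect-size assumptions behind the original N > 37 and reanalysed N > 346 figures are not stated in this abstract, so the two-tailed alpha of .05 and 80% power used here are illustrative assumptions only.

# Illustrative sketch only: approximate sample size needed to detect a Pearson
# correlation r with a two-tailed test, via the Fisher z approximation.
# The alpha and power values are assumptions, not those used by either article.
from math import atanh, ceil
from scipy.stats import norm

def n_required(r, alpha=0.05, power=0.80):
    """Approximate N for detecting a population correlation r (two-tailed)."""
    z_alpha = norm.ppf(1 - alpha / 2)   # critical value of the test
    z_beta = norm.ppf(power)            # quantile for the desired power
    z_r = atanh(r)                      # Fisher z transform of r
    return ceil(((z_alpha + z_beta) / z_r) ** 2 + 3)

print(n_required(0.45))  # ~37, in line with the original N > 37 heuristic
print(n_required(0.22))  # ~160 under these assumptions; the reported N > 346
                         # presumably reflects stricter assumptions (e.g. higher
                         # power or a lower-bound effect estimate)

Under these illustrative assumptions, halving the assumed correlation roughly quadruples the required sample size, which is consistent with the abstract's point that the revised recommendation is far beyond the samples typically collected in published IRAP studies.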
Journal description:
The publication of the book Psychotherapy by Reciprocal Inhibition (1958) by the co-founding editor of this Journal, Joseph Wolpe, marked a major change in the understanding and treatment of mental disorders. The book used principles from empirical behavioral science to explain psychopathological phenomena, and the resulting explanations were critically tested and used to derive effective treatments. The second half of the 20th century saw this rigorous scientific approach come to fruition. Experimental approaches to psychopathology, in particular those used to test conditioning theories and cognitive theories, have steadily expanded, and the experimental analysis of processes characterising and maintaining mental disorders has become an established research area.