Decisions about equivalence: A comparison of TOST, HDI-ROPE, and the Bayes factor.

IF 7.6, Zone 1 (Psychology), Q1 PSYCHOLOGY, MULTIDISCIPLINARY

Psychological Methods, Pub Date: 2023-06-01, DOI: 10.1037/met0000402

Maximilian Linde, Jorge N Tendeiro, Ravi Selker, Eric-Jan Wagenmakers, Don van Ravenzwaaij

Psychological Methods, Volume 28, Issue 3, pp. 740-755.

Citation count: 0

Abstract

Some important research questions require the ability to find evidence for two conditions being practically equivalent. This is impossible to accomplish within the traditional frequentist null hypothesis significance testing framework; hence, other methodologies must be utilized. We explain and illustrate three approaches for finding evidence for equivalence: The frequentist two one-sided tests procedure, the Bayesian highest density interval region of practical equivalence procedure, and the Bayes factor interval null procedure. We compare the classification performances of these three approaches for various plausible scenarios. The results indicate that the Bayes factor interval null approach compares favorably to the other two approaches in terms of statistical power. Critically, compared with the Bayes factor interval null procedure, the two one-sided tests and the highest density interval region of practical equivalence procedures have limited discrimination capabilities when the sample size is relatively small: Specifically, in order to be practically useful, these two methods generally require over 250 cases within each condition when rather large equivalence margins of approximately .2 or .3 are used; for smaller equivalence margins even more cases are required. Because of these results, we recommend that researchers rely more on the Bayes factor interval null approach for quantifying evidence for equivalence, especially for studies that are constrained on sample size. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
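To make the three decision rules named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It relies on several simplifying assumptions that do not come from the abstract: all function names are hypothetical, the tests use a large-sample normal approximation on Cohen's d instead of exact t distributions, the HDI is computed from a flat-prior normal posterior, and the Bayes factor uses a normal prior with an illustrative scale of 0.707 on the effect size.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

STD_NORMAL = NormalDist()


def se_of_d(d, n1, n2):
    """Large-sample approximation of the standard error of Cohen's d."""
    return sqrt((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2)))


def cohens_d(x, y):
    """Return Cohen's d (pooled SD) and its approximate standard error."""
    n1, n2 = len(x), len(y)
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    d = (mean(x) - mean(y)) / sqrt(pooled_var)
    return d, se_of_d(d, n1, n2)


def tost_p(d, se, margin):
    """TOST with z-tests: equivalence is claimed if the larger p-value < alpha."""
    p_lower = 1 - STD_NORMAL.cdf((d + margin) / se)  # H0: delta <= -margin
    p_upper = STD_NORMAL.cdf((d - margin) / se)      # H0: delta >= +margin
    return max(p_lower, p_upper)


def hdi_in_rope(d, se, margin, mass=0.95):
    """HDI-ROPE: is the HDI of an assumed N(d, se^2) posterior inside the ROPE?"""
    z = STD_NORMAL.inv_cdf(0.5 + mass / 2)
    return -margin < d - z * se and d + z * se < margin


def bf_interval_null(d, se, margin, prior_sd=0.707):
    """Bayes factor for |delta| < margin versus |delta| > margin.

    Assumes a N(0, prior_sd^2) prior on delta (an illustrative choice) and a
    normal likelihood, so the posterior is available in closed form.
    """
    prior = NormalDist(0, prior_sd)
    post_var = 1 / (1 / prior_sd**2 + 1 / se**2)
    post = NormalDist(post_var * d / se**2, sqrt(post_var))

    def odds(dist):
        inside = dist.cdf(margin) - dist.cdf(-margin)
        return inside / (1 - inside)

    # Posterior odds of the interval divided by its prior odds.
    return odds(post) / odds(prior)


if __name__ == "__main__":
    margin = 0.2  # equivalence margin on the standardized scale
    for n in (50, 300):  # cases per condition, observed d fixed at 0
        se = se_of_d(0.0, n, n)
        print(n, tost_p(0.0, se, margin), hdi_in_rope(0.0, se, margin),
              bf_interval_null(0.0, se, margin))
```

In this toy calculation with an observed effect of exactly zero and a margin of .2, the TOST and HDI-ROPE rules do not declare equivalence at 50 cases per condition but do at 300, while the interval-null Bayes factor favors equivalence in both cases. This only illustrates the mechanics under the stated assumptions; it does not reproduce the simulation results reported in the article.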




Source journal

Psychological Methods (PSYCHOLOGY, MULTIDISCIPLINARY)

CiteScore: 13.10

Self-citation rate: 7.10%

Articles published: 159

Journal description: Psychological Methods is devoted to the development and dissemination of methods for collecting, analyzing, understanding, and interpreting psychological data. Its purpose is the dissemination of innovations in research design, measurement, methodology, and quantitative and qualitative analysis to the psychological community; its further purpose is to promote effective communication about related substantive and methodological issues. The audience is expected to be diverse and to include those who develop new procedures, those who are responsible for undergraduate and graduate training in design, measurement, and statistics, as well as those who employ those procedures in research.