Reevaluating the reliability of common multiple comparison tests

IF 2.0 · Region 3, Agricultural and Forestry Sciences · JCR Q2, Agronomy

Agronomy Journal, vol. 117, no. 4 · Pub Date: 2025-08-15 · DOI: 10.1002/agj2.70141

Soner Yiğit



Abstract

Many multiple comparison tests are available for determining treatment differences. The validity of these tests is commonly assessed using the Type I error rate, where a Type I error is a false positive result. Fisher's Least Significant Difference (LSD) and Duncan tests are known to have high Type I error rates because even a single false positive is sufficient to constitute a Type I error. For multiple comparison tests, the number of correct decisions (true positives and true negatives) is more important than the Type I error rate. Therefore, in this study, specificity and sensitivity were considered alongside the Type I error rate. Specificity refers to the true negative rate, while sensitivity refers to the true positive rate. A Monte Carlo simulation showed that the LSD and Duncan tests had relatively high Type I error rates; however, when specificity was considered, the LSD and Duncan tests correctly identified statistically similar groups (true negatives) with a mean of 97.00%, while the other tests achieved 99.00%. Regarding sensitivity, the LSD and Duncan tests correctly identified statistically different groups (true positives) with a mean of 15.00%, while the other tests achieved 3.00%. The true negative rate of the other tests is 1.02 times (99.00/97.00) that of LSD and Duncan. In contrast, the true positive rate of LSD and Duncan is 5.00 times (15.00/3.00) that of the other tests. Therefore, considering both specificity and sensitivity, the LSD and Duncan tests were found to be superior to the others. In conclusion, these tests were shown to be more reliable.
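The arithmetic in the abstract can be checked with a minimal sketch. The function names below are hypothetical and are not taken from the paper. The `familywise_error` helper assumes independent comparisons, which is an illustrative simplification (pairwise comparisons that share groups are correlated), and it is not the Monte Carlo procedure used in the study. Only the four mean percentages come from the abstract.

```python
def specificity(true_neg: int, false_pos: int) -> float:
    """True negative rate: TN / (TN + FP)."""
    return true_neg / (true_neg + false_pos)


def sensitivity(true_pos: int, false_neg: int) -> float:
    """True positive rate: TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)


def familywise_error(alpha: float, n_comparisons: int) -> float:
    """Chance of at least one false positive across n comparisons.

    Assumes the comparisons are independent (illustrative assumption only).
    """
    return 1.0 - (1.0 - alpha) ** n_comparisons


# Mean rates reported in the abstract, in percent.
SPEC_LSD_DUNCAN, SPEC_OTHERS = 97.00, 99.00
SENS_LSD_DUNCAN, SENS_OTHERS = 15.00, 3.00

# Ratios as stated in the abstract.
spec_ratio = SPEC_OTHERS / SPEC_LSD_DUNCAN  # other tests relative to LSD/Duncan
sens_ratio = SENS_LSD_DUNCAN / SENS_OTHERS  # LSD/Duncan relative to other tests

if __name__ == "__main__":
    print(f"specificity ratio: {spec_ratio:.2f}")  # 1.02
    print(f"sensitivity ratio: {sens_ratio:.2f}")  # 5.00
    # Hypothetical setting: 10 comparisons, each at alpha = 0.05.
    print(f"familywise error: {familywise_error(0.05, 10):.3f}")  # 0.401
```

The last line illustrates why a per-experiment Type I error rate grows with the number of comparisons: a single false positive anywhere counts as an error, even when nearly all individual decisions are correct.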


Source journal

Agronomy Journal (Agricultural and Forestry Sciences, Agronomy)

CiteScore: 4.70

Self-citation rate: 9.50%

Articles published: 265

Review time: 4.8 months

Journal description: After critical review and approval by the editorial board, AJ publishes articles reporting research findings in soil–plant relationships; crop science; soil science; biometry; crop, soil, pasture, and range management; crop, forage, and pasture production and utilization; turfgrass; agroclimatology; agronomic models; integrated pest management; integrated agricultural systems; and various aspects of entomology, weed science, animal science, plant pathology, and agricultural economics as applied to production agriculture. Notes are published about apparatus, observations, and experimental techniques. Observations usually are limited to studies and reports of unrepeatable phenomena or other unique circumstances. Review and interpretation papers are also published, subject to standard review. Contributions to the Forum section deal with current agronomic issues and questions in brief, thought-provoking form. Such papers are reviewed by the editor in consultation with the editorial board.