Reevaluating the reliability of common multiple comparison tests
Soner Yiğit
Agronomy Journal, volume 117, issue 4. Published 2025-08-15. Publication type: Journal Article.
DOI: 10.1002/agj2.70141
https://acsess.onlinelibrary.wiley.com/doi/10.1002/agj2.70141
Citation count: 0
Abstract
Many multiple comparison tests are available for determining treatment differences. The validity of these tests is commonly assessed using the Type I error rate, where a Type I error is a false positive result. Fisher's Least Significant Difference (LSD) and Duncan tests are known to have high Type I error rates because even a single false positive is sufficient to constitute a Type I error. For multiple comparison tests, the number of correct decisions (true positives and true negatives) is more important than the Type I error rate. Therefore, in this study, specificity and sensitivity were considered alongside the Type I error rate. Specificity refers to the true negative rate, while sensitivity refers to the true positive rate. A Monte Carlo simulation showed that the LSD and Duncan tests had relatively high Type I error rates; however, when specificity was considered, the LSD and Duncan tests correctly identified statistically similar groups (true negatives) with a mean of 97.00%, while the other tests achieved 99.00%. Regarding sensitivity, the LSD and Duncan tests correctly identified statistically different groups (true positives) with a mean of 15.00%, while the other tests achieved 3.00%. The true negative rate of the other tests is 1.02 times (99.00/97.00) that of LSD and Duncan. In contrast, the true positive rate of LSD and Duncan is 5.00 times (15.00/3.00) that of the other tests. Therefore, considering both specificity and sensitivity, the LSD and Duncan tests were found to be superior to the others. In conclusion, these tests were shown to be more reliable.
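The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the study's simulation code: the `Decision` pair, the helper functions, and the six-comparison `example` are hypothetical and only show how sensitivity, specificity, and the "single false positive" criterion for a Type I error are computed from a set of pairwise decisions. The last two lines reproduce the ratios quoted in the abstract.

```python
from typing import List, Tuple

# Hypothetical representation of one pairwise comparison:
# (groups truly differ, test declared them different)
Decision = Tuple[bool, bool]


def tally(decisions: List[Decision]) -> Tuple[int, int, int, int]:
    """Count true positives, false positives, true negatives, false negatives."""
    tp = fp = tn = fn = 0
    for truly_different, declared_different in decisions:
        if truly_different and declared_different:
            tp += 1
        elif truly_different:
            fn += 1
        elif declared_different:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def sensitivity(decisions: List[Decision]) -> float:
    """True positive rate: TP / (TP + FN); NaN if no pair truly differs."""
    tp, _, _, fn = tally(decisions)
    return tp / (tp + fn) if (tp + fn) else float("nan")


def specificity(decisions: List[Decision]) -> float:
    """True negative rate: TN / (TN + FP); NaN if every pair truly differs."""
    _, fp, tn, _ = tally(decisions)
    return tn / (tn + fp) if (tn + fp) else float("nan")


def has_type_i_error(decisions: List[Decision]) -> bool:
    """A single false positive among the comparisons is enough to count as an error."""
    return any(declared and not truly for truly, declared in decisions)


# Made-up experiment with six pairwise comparisons (illustration only).
example = [
    (True, True), (True, False), (True, False),
    (False, False), (False, False), (False, True),
]

# Ratios quoted in the abstract, recomputed from the reported percentages.
specificity_ratio = round(99.00 / 97.00, 2)
sensitivity_ratio = round(15.00 / 3.00, 2)
```

In this made-up example, one false positive out of three truly similar pairs already marks the whole experiment as containing a Type I error, even though two of those three pairs were classified correctly. That is the contrast between the experiment-level error rate and the per-comparison rates that the abstract draws.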
About the journal:
After critical review and approval by the editorial board, AJ publishes articles reporting research findings in soil–plant relationships; crop science; soil science; biometry; crop, soil, pasture, and range management; crop, forage, and pasture production and utilization; turfgrass; agroclimatology; agronomic models; integrated pest management; integrated agricultural systems; and various aspects of entomology, weed science, animal science, plant pathology, and agricultural economics as applied to production agriculture.
Notes are published about apparatus, observations, and experimental techniques. Observations usually are limited to studies and reports of unrepeatable phenomena or other unique circumstances. Review and interpretation papers are also published, subject to standard review. Contributions to the Forum section deal with current agronomic issues and questions in brief, thought-provoking form. Such papers are reviewed by the editor in consultation with the editorial board.