Eight reasons not to test for baseline group equivalence in a parallel groups pretest-posttest study
Seth Lindstromberg
Research Methods in Applied Linguistics, Volume 4, Issue 3, Article 100254
Published: 2025-08-25
DOI: 10.1016/j.rmal.2025.100254
https://www.sciencedirect.com/science/article/pii/S2772766125000758
Abstract
The parallel groups pretest-posttest design has long been prominent in quantitative SLA research. Ideally, groups are formed by random assignment of individuals. But with or without random assignment, groups may differ substantially on key pre-treatment measures such as pretest scores. When faced with non-equivalent groups, many SLA researchers have tested the difference(s) for statistical significance in the belief that p > .05 allows a main statistical analysis which assumes that the pre-treatment group means do not differ. The applied statistics literature includes numerous accounts of why such “baseline equivalence” (BE) testing is misguided. Yet BE tests continue to be reported in SLA journals at all levels of reputation. This paper describes BE testing, reviews its flaws, shows that the practice persists, discusses possible reasons why BE tests may be thought to be legitimate, and considers options in study planning that lead to superior results and avoid the conditions that appear to make BE testing necessary.