Replication study in education

IF 2.3 Q1 EDUCATION & EDUCATIONAL RESEARCH
Thomas Perry, B. See
DOI: 10.1080/13803611.2021.2022307
Educational Research and Evaluation, 27(1), 1–7. Journal Article, published 2022-02-17.
Citations: 2

Abstract

Replication study in education
The last two decades have seen many developments in education research, including the growth of robust studies testing education programmes and policies using experimental designs (Hedges, 2018), such as randomised controlled trials (RCTs). RCTs have been a focus for replication study in education, with researchers seeking to replicate similar programmes under trial conditions. These replications have had varying results and have raised questions about why some results have successfully replicated and others have not. Results from recent Education Endowment Foundation effectiveness trials are good examples of this. A number of education programmes have shown beneficial effects on young people’s learning outcomes in efficacy trials, but no effects in larger-scale effectiveness trials. Examples of these programmes include Philosophy for Children (Gorard et al., 2018), Switch-on (Reading Recovery) (Gorard et al., 2014), and Accelerated Reader (Gorard et al., 2015). Some may conclude that one of these evaluations must be wrong. It is important to realise that in all of these examples, the contexts and the fidelity of implementation differed. In the Philosophy for Children effectiveness trial, 53% of the schools did not implement the intervention as intended (Lord et al., 2021). This is the nature of effectiveness trials, where the programme is delivered in real-life conditions, whereas in efficacy trials the delivery would be closely monitored and controlled, and with a smaller sample. Similarly, with the Switch-on evaluation, although schools delivered the required number of sessions, they modified the content and the delivery format of the intervention (Patel et al., 2017). There were also important differences between the efficacy and effectiveness trials. The efficacy trial was conducted with first-year secondary school children, whereas the effectiveness trial was with primary school children. The tests used also differed in the two evaluations.
In the efficacy trial, reading was measured using the GL New Group Reading Test (Gorard et al., 2014), but in the effectiveness trial the test used was the Hodder Group Reading Test (Patel et al., 2017). What these two examples suggest is that variations in the context and target population for the study and variations in the measures and experimental conditions can have an appreciable effect on the result. These examples also highlight the point that adherence to the fundamental principles of the original programme is essential for effective replication. Without this, it is difficult to know whether unsuccessful replication is because the programme does not work, or that it does not work with a certain population or under certain conditions. It is therefore worthwhile replicating these studies while maintaining high fidelity to the intervention and at the same time varying the population and instruments used as suggested by Wiliam (2022). Related to these efforts are questions about the role of science in education and the form it should take. RCTs represent a rigorous method of investigation and are often considered the gold standard in scientific research. There are, however, caveats associated with them and ongoing debates about their benefits and limitations (Connolly
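The fidelity point in the abstract can be illustrated with a toy Monte Carlo sketch (the effect size, sample sizes, and simulation set-up below are hypothetical, not data from the actual trials): when roughly half of treatment schools do not deliver the programme as intended, the intention-to-treat estimate is diluted towards fidelity × true effect, which is one mechanism by which an efficacy result can fail to replicate at scale.

```python
import random
import statistics

random.seed(42)

def simulate_trial(n_per_arm, true_effect, fidelity):
    """Simulate one two-arm trial with standardised (mean 0, SD 1) outcomes.

    Treated pupils only receive the true effect when their school
    implements the programme as intended (probability `fidelity`);
    otherwise their outcome is drawn from the control distribution.
    Returns the intention-to-treat effect estimate.
    """
    control = [random.gauss(0, 1) for _ in range(n_per_arm)]
    treated = [
        random.gauss(true_effect if random.random() < fidelity else 0, 1)
        for _ in range(n_per_arm)
    ]
    return statistics.mean(treated) - statistics.mean(control)

# Efficacy-style conditions: smaller sample, delivery closely monitored,
# so fidelity is taken as 1.0 (an idealisation).
efficacy = statistics.mean(
    simulate_trial(200, true_effect=0.2, fidelity=1.0) for _ in range(500)
)

# Effectiveness-style conditions: larger sample, but roughly half of the
# schools fail to implement as intended (fidelity 0.47, echoing the 53%
# non-implementation figure reported for Philosophy for Children).
effectiveness = statistics.mean(
    simulate_trial(2000, true_effect=0.2, fidelity=0.47) for _ in range(500)
)

print(f"efficacy-style estimate:      {efficacy:.3f}")       # close to 0.20
print(f"effectiveness-style estimate: {effectiveness:.3f}")  # close to 0.47 * 0.20
```

The sketch deliberately conflates all implementation failure into a single all-or-nothing fidelity probability; real trials involve partial and modified delivery, as the Switch-on example shows, but the dilution logic is the same.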
Source journal
Educational Research and Evaluation (EDUCATION & EDUCATIONAL RESEARCH)
CiteScore: 3.00
Self-citation rate: 0.00%
Articles published per year: 25
About the journal: International, comparative and multidisciplinary in scope, Educational Research and Evaluation (ERE) publishes original, peer-reviewed academic articles dealing with research on issues of worldwide relevance in educational practice. The aim of the journal is to increase understanding of learning in pre-primary, primary, high school, college, university and adult education, and to contribute to the improvement of educational processes and outcomes. The journal seeks to promote cross-national and international comparative educational research by publishing findings relevant to the scholarly community, as well as to practitioners and others interested in education. The scope of the journal is deliberately broad in terms of both topics covered and disciplinary perspective.