Predicting literacy intervention responsiveness using semi-supervised machine learning

IF 2.6 | Zone 2 (Medicine) | Q1 EDUCATION, SPECIAL

Research in Developmental Disabilities | Pub Date: 2025-08-21 | DOI: 10.1016/j.ridd.2025.105090

Amanda Swee-Ching Tan, Farhan Ali, Chiew Lim Lee, Kenneth K. Poon

Volume 165, Article 105090. Publication type: Journal Article. Full text: https://www.sciencedirect.com/science/article/pii/S089142222500174X

Citations: 0

Abstract

Background

Non-responsiveness to systematic phonics interventions is pervasive, and these interventions have tended to focus on near-transfer outcomes related to phonology. There is a need to predict intervention responsiveness for far-transfer outcomes such as literacy-relevant word reading and spelling. Advanced machine learning also has the potential to maximize predictive power.

Aims

This study aims to longitudinally predict responsiveness to systematic phonics intervention using machine learning models.

Method

The sample comprised children with special educational needs (M = 98.08 months, N = 838) who either received long-term intervention (average duration of 33.62 months; labeled data) or had only baseline data without intervention (unlabeled data). We trained 12 semi-supervised learning models on the mix of labeled and unlabeled data to predict intervention responsiveness in word reading and spelling. Predictors were background information, domain-general cognitive abilities, and language-related achievement scores; expanded predictors consisted of differences among these predictors.
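The abstract does not say which semi-supervised scheme or which difference features were used, so the following is only a minimal sketch under stated assumptions: self-training (one common semi-supervised approach) wrapped around a small hand-written Gaussian Naïve Bayes classifier, with expanded predictors taken to be all pairwise differences between scores. All function names, the confidence threshold, and the toy scores are hypothetical and are not taken from the study.

```python
import math
from itertools import combinations


def expand_with_differences(row):
    """Append every pairwise difference (a - b) to a list of predictor scores."""
    return list(row) + [a - b for a, b in combinations(row, 2)]


def fit_gaussian_nb(X, y):
    """Estimate a prior, per-feature means, and per-feature variances per class."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        n = len(rows)
        cols = list(zip(*rows))
        means = [sum(col) / n for col in cols]
        # The small constant keeps every variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-6
            for col, m in zip(cols, means)
        ]
        model[label] = (n / len(y), means, variances)
    return model


def predict_proba(model, x):
    """Return {label: posterior probability} for one feature vector."""
    log_post = {}
    for label, (prior, means, variances) in model.items():
        ll = math.log(prior)
        for v, m, s2 in zip(x, means, variances):
            ll += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        log_post[label] = ll
    top = max(log_post.values())  # subtract the max for numerical stability
    scaled = {k: math.exp(v - top) for k, v in log_post.items()}
    total = sum(scaled.values())
    return {k: v / total for k, v in scaled.items()}


def predict(model, x):
    """Return the most probable label for one feature vector."""
    proba = predict_proba(model, x)
    return max(proba, key=proba.get)


def self_train(X_lab, y_lab, X_unlab, threshold=0.9, max_rounds=10):
    """Self-training: pseudo-label confident unlabeled rows, then refit."""
    X, y, pool = list(X_lab), list(y_lab), list(X_unlab)
    model = fit_gaussian_nb(X, y)
    for _ in range(max_rounds):
        remaining = []
        for x in pool:
            proba = predict_proba(model, x)
            label = max(proba, key=proba.get)
            if proba[label] >= threshold:
                X.append(x)
                y.append(label)
            else:
                remaining.append(x)
        if len(remaining) == len(pool):  # nothing new was pseudo-labeled
            break
        pool = remaining
        model = fit_gaussian_nb(X, y)
    return model


# Toy baseline scores (invented): two predictors per child.
raw_labeled = [[90, 85], [95, 88], [92, 91], [70, 60], [65, 62], [72, 58]]
y_lab = [1, 1, 1, 0, 0, 0]  # 1 = responder, 0 = non-responder (assumed coding)
raw_unlabeled = [[93, 87], [68, 61], [80, 75]]  # baseline only, no outcome

X_lab = [expand_with_differences(r) for r in raw_labeled]
X_unlab = [expand_with_differences(r) for r in raw_unlabeled]

model = self_train(X_lab, y_lab, X_unlab)
print(predict(model, expand_with_differences([94, 89])))
```

The point of the sketch is the data flow: children with outcomes supply labels, baseline-only children are added to training only when the current model is confident about them, and the difference features are computed identically for both groups.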

Results

Among the 12 models developed, the Random Forest and Gaussian Naïve Bayes models achieved the highest F1 score (0.7) on the test set, supported by the incorporation of unlabeled data and expanded predictors. The top predictors were related to verbal comprehension, visual memory, and verbal working memory.
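For readers unfamiliar with the metric, F1 is the harmonic mean of precision and recall. The sketch below computes it from true and predicted labels. The labels are invented so that precision and recall both equal 0.7; they are not the study's confusion matrix, and the function name and the coding of responders as 1 are assumptions.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 = 2 * precision * recall / (precision + recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Illustrative labels only: 10 responders and 10 non-responders.
y_true = [1] * 10 + [0] * 10
# 7 responders found, 3 missed, 3 non-responders wrongly flagged.
y_pred = [1] * 7 + [0] * 3 + [1] * 3 + [0] * 7

print(round(f1_score(y_true, y_pred), 2))
```

Unlike accuracy, F1 ignores true negatives, so it is not inflated by correctly predicting the majority class when responders and non-responders are imbalanced.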

Conclusions

We identified important predictors of intervention responsiveness and showed the promise of machine learning models, with implications for the allocation of resources, mitigation of the risk of failure, and tailoring of interventions.

Source journal

Research in Developmental Disabilities Multiple-

CiteScore

5.50

Self-citation rate

6.50%

Articles published

178

About the journal: Research in Developmental Disabilities aims to publish original interdisciplinary research that has a direct bearing on the remediation of problems associated with developmental disabilities. Manuscripts are solicited from throughout the world. Articles are primarily empirical studies, although an occasional position paper or review is accepted. The journal publishes articles on all aspects of research with the developmentally disabled, and any methodologically sound approach is acceptable.