Predicting literacy intervention responsiveness using semi-supervised machine learning
Amanda Swee-Ching Tan, Farhan Ali, Chiew Lim Lee, Kenneth K. Poon
Research in Developmental Disabilities, Volume 165, Article 105090. Published 2025-08-21.
DOI: 10.1016/j.ridd.2025.105090
https://www.sciencedirect.com/science/article/pii/S089142222500174X
Abstract
Background
Non-responsiveness to systematic phonics interventions is pervasive, and these interventions have tended to focus on near-transfer outcomes related to phonology. There is a need to predict intervention responsiveness for far-transfer outcomes such as literacy-relevant word reading and spelling. There is also potential for advanced machine learning to maximize predictive power.
Aims
This study aims to longitudinally predict responsiveness to systematic phonics intervention using machine learning models.
Method
The sample included children with special educational needs (mean age M = 98.08 months, N = 838) who either received long-term intervention (average duration of 33.62 months; labeled data) or had only baseline data without intervention (unlabeled data). We trained 12 semi-supervised learning models on the mix of labeled and unlabeled data to predict intervention responsiveness in word reading and spelling. Predictors were background information, domain-general cognitive abilities, and language-related achievement scores; an expanded predictor set added the differences among these predictors.
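The two ideas in this Method paragraph, expanding predictors with their pairwise differences and learning from both labeled and unlabeled cases, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not say which semi-supervised scheme was used, so this sketch uses self-training (pseudo-labeling) with a simple nearest-centroid classifier, not any of the authors' 12 models. The function names, the `margin` threshold, and the toy scores are all hypothetical.

```python
from itertools import combinations


def expand_with_differences(row):
    """Return the base predictors followed by all pairwise differences."""
    return list(row) + [a - b for a, b in combinations(row, 2)]


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def fit_centroids(labeled):
    """Compute one mean vector per class (0 and 1) from (features, label) pairs."""
    cents = {}
    for y in (0, 1):
        rows = [x for x, lab in labeled if lab == y]
        cents[y] = [sum(col) / len(rows) for col in zip(*rows)]
    return cents


def predict(cents, x):
    """Assign the class whose centroid is nearest to x."""
    return 0 if sq_dist(x, cents[0]) <= sq_dist(x, cents[1]) else 1


def self_train(labeled, unlabeled, margin=50.0, max_rounds=5):
    """Self-training: pseudo-label only the unlabeled rows the model is sure about.

    A row counts as 'sure' when its squared distances to the two class
    centroids differ by more than `margin` (a hypothetical threshold).
    """
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        cents = fit_centroids(labeled)
        confident, rest = [], []
        for x in pool:
            gap = abs(sq_dist(x, cents[0]) - sq_dist(x, cents[1]))
            if gap > margin:
                confident.append((x, predict(cents, x)))
            else:
                rest.append(x)
        if not confident:
            break  # nothing new to learn from; stop early
        labeled.extend(confident)
        pool = rest
    return fit_centroids(labeled), labeled


# Made-up toy scores (two predictors per child); 1 = responder, 0 = non-responder.
toy_labeled = [([90.0, 85.0], 0), ([110.0, 100.0], 1),
               ([88.0, 90.0], 0), ([112.0, 104.0], 1)]
toy_unlabeled = [[92.0, 86.0], [108.0, 101.0], [100.0, 95.0]]

labeled = [(expand_with_differences(x), y) for x, y in toy_labeled]
unlabeled = [expand_with_differences(x) for x in toy_unlabeled]

centroids, augmented = self_train(labeled, unlabeled)
print("training rows after pseudo-labeling:", len(augmented))
print("prediction:", predict(centroids, expand_with_differences([91.0, 88.0])))
```

In this toy run the ambiguous third unlabeled row is never pseudo-labeled, which is the point of the confidence threshold: unlabeled cases only influence the model when the current fit is already fairly certain about them.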
Results
Among the 12 models developed, the Random Forest and Gaussian Naïve Bayes models achieved the highest F1 score (0.7) on the test set, supported by the incorporation of unlabeled data and expanded predictors. The top predictors were related to verbal comprehension, visual memory, and verbal working memory.
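For readers unfamiliar with the metric reported here, the F1 score is the harmonic mean of precision and recall for the positive class. The sketch below computes it from scratch; the function name and the label vectors are made-up illustrations and do not reproduce the study's data or its reported value.

```python
def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical test-set labels: 1 = responder, 0 = non-responder.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 1]
toy_f1 = f1_score(y_true, y_pred)
print(round(toy_f1, 2))
```

Because F1 ignores true negatives, it is often preferred over plain accuracy when one outcome class is much rarer than the other.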
Conclusions
We identified important predictors of intervention responsiveness and showed the promise of machine learning models, with implications for the allocation of resources, mitigation of the risk of failure, and tailoring of interventions.
About the journal:
Research In Developmental Disabilities is aimed at publishing original research of an interdisciplinary nature that has a direct bearing on the remediation of problems associated with developmental disabilities. Manuscripts will be solicited throughout the world. Articles will be primarily empirical studies, although an occasional position paper or review will be accepted. The aim of the journal will be to publish articles on all aspects of research with the developmentally disabled, with any methodologically sound approach being acceptable.