Caitlin Hemlock, Laura H Kwong, Lia C H Fernald, Alan E Hubbard, John M Colford, Fahmida Tofail, Md Mahbubur Rahman, Sarker Parvez, Stephen P Luby, Andrew N Mertens
medRxiv: the preprint server for health sciences. Published 2025-06-18. doi: 10.1101/2025.06.17.25329796. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12204266/pdf/
Using machine learning to identify subgroups with the highest expected benefit in a population-based water, sanitation, handwashing, and nutrition intervention.
Background: Understanding who benefits most from investments in water, sanitation, and hygiene (WaSH) interventions can elucidate causal pathways, uncover complex interactions between population characteristics and interventions, and inform targeted implementation. We applied machine learning to identify and describe households of children that benefited most from WaSH and nutrition interventions.
Methods: We used causal forests and baseline characteristics of pregnant women enrolled in a trial in Bangladesh (2013-2015) to test for heterogeneous treatment effects on the primary trial outcomes at two years (length-for-age Z-score [LAZ-score] and diarrhoea prevalence) and one secondary outcome (child development [EASQ Z-score]) for each treatment-outcome combination. We split households into three groups based on predicted treatment effect magnitude and compared characteristics of those that benefited the most (Tercile 3) versus the least (Tercile 1).
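The tercile step described in the Methods can be illustrated with a minimal Python sketch. It assumes the per-household predicted treatment effects have already been produced by some fitted model (the causal forest itself is not shown), and it is not the authors' code. The function names, the nine predicted effects, and the binary baseline flag are all hypothetical values invented for illustration.

```python
from statistics import mean

def assign_terciles(effects):
    """Return a tercile label (1, 2 or 3) for each predicted effect.

    Households are ranked by predicted effect; the lowest third is
    labelled 1 and the highest third 3. Ties keep their input order.
    """
    n = len(effects)
    order = sorted(range(n), key=lambda i: effects[i])
    labels = [0] * n
    for rank, i in enumerate(order):
        labels[i] = rank * 3 // n + 1
    return labels

def prevalence_in_tercile(labels, flag, tercile):
    """Share of households in `tercile` for which the binary `flag` is 1."""
    return mean(f for lab, f in zip(labels, flag) if lab == tercile)

# Hypothetical predicted effects (SD units) and a hypothetical binary
# baseline characteristic for nine households; NOT data from the trial.
predicted_effect = [0.02, 0.45, 0.10, 0.55, -0.05, 0.30, 0.60, 0.25, 0.05]
animal_feces_seen = [0, 1, 0, 1, 0, 1, 1, 0, 1]

terciles = assign_terciles(predicted_effect)
low = prevalence_in_tercile(terciles, animal_feces_seen, 1)
high = prevalence_in_tercile(terciles, animal_feces_seen, 3)
print(f"Tercile 1: {low:.0%}, Tercile 3: {high:.0%}")
```

Ranking and integer division keep the three groups as equal in size as possible even when the number of households is not a multiple of three. A quantile-cutpoint approach would be an equally reasonable choice.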
Results: Heterogeneity was detected in the effect of Sanitation on EASQ Z-score, compared to Control; children in Tercile 3 were estimated to gain 0.51 SD (95% CI: 0.35, 0.67), whereas children in Tercile 1 were estimated to have no benefit. At baseline, households of children in Tercile 3 were more likely than those in Tercile 1 to report that chickens always entered the house (85% vs. 4%) and to have animal feces observed in the child's play area (84% vs. 18%). Tercile 3 households also owned less land and fewer assets and lived farther from Dhaka, any population center, or a market. We did not detect heterogeneity for any other treatment-outcome comparison.
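To show how a within-tercile estimate such as "0.51 SD (95% CI: 0.35, 0.67)" can be read, the sketch below computes an unadjusted difference in means between two arms with a normal-approximation interval. It is an assumption-laden illustration, not the estimator used in the study: it ignores clustering and any adjustment the authors may have applied, and the outcome values are hypothetical. The last lines back out the standard error implied by the reported interval, assuming that interval is symmetric and based on a 1.96 multiplier.

```python
from math import sqrt
from statistics import mean, variance

def diff_in_means_ci(treated, control, z=1.96):
    """Unadjusted difference in means with a normal-approximation CI.

    Uses the unpooled (Welch-style) standard error and ignores any
    clustering of observations.
    """
    diff = mean(treated) - mean(control)
    se = sqrt(variance(treated) / len(treated)
              + variance(control) / len(control))
    return diff, diff - z * se, diff + z * se

# Hypothetical Z-scores for children in one tercile; NOT trial data.
sanitation_arm = [0.6, 0.2, 0.9, 0.4, 0.7, 0.5]
control_arm = [0.1, -0.2, 0.3, 0.0, -0.1, 0.2]

estimate, lower, upper = diff_in_means_ci(sanitation_arm, control_arm)
print(f"difference = {estimate:.2f} SD (95% CI: {lower:.2f}, {upper:.2f})")

# Standard error implied by the reported interval (0.35, 0.67),
# assuming a symmetric interval of the form estimate +/- 1.96 * SE.
implied_se = (0.67 - 0.35) / (2 * 1.96)
print(f"implied SE = {implied_se:.3f}")
```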
Conclusions: We did not detect heterogeneity in any treatment arm for the outcomes of diarrhoea or LAZ-score, suggesting that children benefit similarly from effective interventions regardless of baseline household characteristics. We found heterogeneity in the effect of receiving sanitation improvements on child development: poorer households located in more remote areas, and potentially with higher levels of animal fecal contamination, had the highest expected benefit.