Using linked Hospital Episode Statistics data to aid the handling of non-response and restore sample representativeness in the 1958 National Child Development Study.

IF 1.6 Q3 HEALTH CARE SCIENCES & SERVICES
Nasir Rajah, L. Calderwood, B. D. De Stavola, K. Harron, G. Ploubidis, R. Silverwood
International Journal of Population Data Science, published 2022-08-25. DOI: 10.23889/ijpds.v7i3.1997 (https://doi.org/10.23889/ijpds.v7i3.1997)
Citations: 0

Abstract

Objectives: There is growing interest in whether linked administrative data can aid analyses subject to missing data in cohort studies. We aimed to identify predictors of cohort non-response in linked administrative data and to examine whether including these variables in principled methods for missing data handling can help restore sample representativeness.

Approach: Using linked 1958 National Child Development Study (NCDS) and Hospital Episode Statistics (HES) data, we applied a multi-stage data-driven approach to identify HES variables that are predictive of non-response at the age 55 sweep of NCDS. We then included these variables as auxiliary variables in multiple imputation (MI) analyses to see whether they helped restore sample representativeness, both in terms of early life variables that were essentially fully observed in NCDS (mother's husband's social class at birth, cognitive ability at age 7) and relative to external population data (educational qualifications at age 55, marital status at age 55).

Results: We took as our starting point 57 variables derived from HES data, based on the presence or number of different types of appointments/admissions, diagnostic codes and treatment codes. After applying our multi-stage data-driven approach, we identified five HES variables that were predictive of non-response at age 55 in NCDS. For example, cohort members who had been treated for adult mental illness were almost 3 times as likely to be non-respondents (risk ratio 2.81; 95% confidence interval 2.05, 3.86). Inclusion of these variables in MI analyses did help restore sample representativeness. However, there was no additional gain in sample representativeness relative to analyses using only previously identified survey predictors of non-response (i.e. NCDS rather than HES variables).

Conclusion: In our applications, inclusion of HES predictors of NCDS non-response in analyses did not improve sample representativeness beyond that possible using survey variables alone. Whilst this finding may not extend to other analyses or NCDS sweeps, it highlights the utility of survey variables in handling non-response.
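
To make the reported effect measure concrete, the following is a minimal sketch of how a risk ratio and a 95% confidence interval can be computed from a 2x2 table, using a Wald interval on the log scale. The abstract does not say how the authors estimated their risk ratio (it may well come from a regression model), so this is an illustration of the quantity, not their method. The function name `risk_ratio` and all counts are hypothetical and are not taken from the study.

```python
import math


def risk_ratio(events_exposed, n_exposed, events_unexposed, n_unexposed, z=1.96):
    """Return (risk ratio, lower, upper) using a Wald interval on the log scale.

    "Exposed" could be, for example, cohort members with a given HES indicator,
    and an "event" could be non-response at a given sweep.
    """
    risk_exposed = events_exposed / n_exposed
    risk_unexposed = events_unexposed / n_unexposed
    rr = risk_exposed / risk_unexposed

    # Standard error of log(RR) for two independent binomial proportions.
    se_log_rr = math.sqrt(
        1 / events_exposed - 1 / n_exposed
        + 1 / events_unexposed - 1 / n_unexposed
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts for illustration only (NOT from NCDS or HES):
# 30 of 100 exposed and 100 of 1000 unexposed cohort members are non-respondents.
rr, lower, upper = risk_ratio(30, 100, 100, 1000)
print(f"risk ratio {rr:.2f}; 95% confidence interval {lower:.2f}, {upper:.2f}")
```

An interval that excludes 1, as in the result reported above, indicates that the variable is associated with non-response at the conventional 5% level.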
Source journal: CiteScore 2.50; self-citation rate 0.00%; articles published 386; review time 20 weeks.