{"title":"Data resource profile: a guide for constructing school-to-work sequence analysis trajectories using the longitudinal education outcomes (LEO) data.","authors":"Shivani Sickotra","doi":"10.23889/ijpds.v8i6.2953","DOIUrl":"10.23889/ijpds.v8i6.2953","url":null,"abstract":"<p><strong>Introduction: </strong>Sequence analysis is a powerful methodology for examining longitudinal school-to-work trajectories. Despite its growing use, there is limited guidance on preparing suitable datasets. This resource details the creation of a dataset specifically designed for sequence analysis, capturing yearly education and employment activity states for 556,182 individuals from England's 2010/11 school-leaver cohort.</p><p><strong>Methods: </strong>The dataset was constructed using the Department for Education's Longitudinal Education Outcomes (LEO) data. SQL was used to extract relevant variables, and data linkage and preprocessing were performed using R. Data processing was tailored to sequence analysis, including reducing the number of activity states and applying a hierarchy to integrate education and employment data.</p><p><strong>Results: </strong>The resulting dataset spans activities from the first non-compulsory state in 2011/12 until 2018/19, tracking trajectories from ages 16/17 to 23/24. The dataset was designed so that school-leavers can be subset by their initial Combined Authority residence, aiding regional analysis of school-to-work trajectories. Individual-level socio-demographic characteristics that can be linked to the longitudinal activity histories were also built, alongside longitudinal geographic locations and employment earnings data. Additionally, the limitations of the developed data are discussed.</p><p><strong>Conclusion: </strong>This resource provides crucial guidance for researchers and practitioners who may require experience preparing input datasets for sequence analysis, addressing the current gap in available resources. 
By offering step-by-step instructions and shared code, it empowers users to recreate or adapt the dataset for their specific research needs. Its ability to subset by region further supports localised and comparative studies of school-to-work trajectories, making it a valuable tool for advancing existing research. The LEO data can be accessed by application through the Office for National Statistics Secure Research Service.</p>","PeriodicalId":36483,"journal":{"name":"International Journal of Population Data Science","volume":"8 6","pages":"2953"},"PeriodicalIF":1.6,"publicationDate":"2025-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11935648/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143711548","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Caitlin Gray, Helen Leonard, Matthew N Cooper, Dheeraj Rai, Emma J Glasson
{"title":"The application of population data linkage to capture sibling health outcomes among children and young adults with neurodevelopmental conditions. A scoping review.","authors":"Caitlin Gray, Helen Leonard, Matthew N Cooper, Dheeraj Rai, Emma J Glasson","doi":"10.23889/ijpds.v10i1.2413","DOIUrl":"10.23889/ijpds.v10i1.2413","url":null,"abstract":"<p><strong>Introduction: </strong>Siblings of children with neurodevelopmental conditions have unique experiences and challenges related to their sibling role. Some develop mental health concerns as measured by self-reported surveys or parent report. Few data are available at the population level, owing to difficulties capturing wide-scale health data for siblings. Data linkage is a technique that can facilitate such research.</p><p><strong>Objective: </strong>To explore the application of population data linkage as a research method to capture health outcomes of siblings of children with neurodevelopmental conditions.</p><p><strong>Inclusion criteria: </strong>Peer-reviewed papers that captured health outcomes for siblings of children and young adults with neurodevelopmental conditions using population data linkage.</p><p><strong>Methods: </strong>JBI scoping review methods were followed. Papers were searched within CINAHL, Ovid, Scopus, and Web of Science from 2000 to 2024 using search terms relating to 'data linkage', 'neurodevelopmental conditions', 'siblings' and 'health outcomes'.</p><p><strong>Results: </strong>The final data extraction included 31 papers. The neurodevelopmental conditions of index children were autism, attention deficit hyperactivity disorder, intellectual disability, cerebral palsy and developmental delay. The mean follow-up time was 31 years, and the majority of studies originated from Scandinavia. 
Sibling health outcomes observed were psychiatric diagnoses, self-harm and suicide, other neurodevelopmental conditions, and medical conditions such as atopic disease, cancer and obesity.</p><p><strong>Conclusion: </strong>Data linkage can help capture sibling health outcomes quickly across large cohorts with a range of neurodevelopmental conditions. Future research could be enhanced by focusing on siblings as the primary group of interest, increased integration of genealogical data, and comparisons between diagnostic groups and severity levels. Adoption of established rigorous reporting methods will increase the replicability of this type of research, and provide a stronger evidence-base from which to inform sibling supports.</p>","PeriodicalId":36483,"journal":{"name":"International Journal of Population Data Science","volume":"10 1","pages":"2413"},"PeriodicalIF":1.6,"publicationDate":"2025-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11923734/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143671252","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Joseph Lam, Mario Cortina-Borja, Robert Aldridge, Ruth Blackburn, Katie Harron
{"title":"Data Note: Alternative Name Encodings - Using Jyutping or Pinyin as tonal representations of Chinese names for data linkage.","authors":"Joseph Lam, Mario Cortina-Borja, Robert Aldridge, Ruth Blackburn, Katie Harron","doi":"10.23889/ijpds.v8i5.2935","DOIUrl":"10.23889/ijpds.v8i5.2935","url":null,"abstract":"<p>Accurate data linkage across large administrative databases is crucial for addressing complex research and policy questions, yet linkage errors, stemming from inconsistent name representations, can introduce biases, predominantly for names not given in English. This data note examines the impact of romanisation on linkage accuracy, focusing on Chinese names and comparing standardised systems (Jyutping and Pinyin) with the non-standardised Hong Kong Government Cantonese Romanisation (HKG-romanisation). We identify three primary issues: language-specific variations in romanisation, the loss of tonal information inherent to tonal languages, and discrepancies in name order conventions. Using a dataset of 771 Hong Kong student names, our analysis reveals that standardised romanisation systems enhance the uniqueness and consistency of name representations, thereby improving linkage precision and recall compared to HKG-romanisation. Specifically, Jyutping and Pinyin achieved over 95% recall in blocking strategies, whereas HKG-romanisation only reached 68.8%. Incorporating tonal information further improved recall. These findings underscore the necessity of adopting standardised, tone-sensitive romanisation systems and flexible database designs to reduce linkage errors and promote data equity for under-represented groups. 
We advocate for the implementation of phonetic encodings in databases, alongside language-specific pre-processing protocols, to ensure more inclusive and accurate data linkage processes.</p>","PeriodicalId":36483,"journal":{"name":"International Journal of Population Data Science","volume":"8 5","pages":"2935"},"PeriodicalIF":1.6,"publicationDate":"2025-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11897931/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143616678","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Michael O Budu, Katherine W Kooij, Kate Heath, Taylor McLinden, Claudette Cardinal, Scott D Emerson, Paul Sereda, Jason Trigg, Jenny Li, Erin Ding, Mark W Hull, Kate Salters, Viviane D Lima, Rolando Barrios, Julio S G Montaner, Robert S Hogg
{"title":"Cohort Profile Update: Reflecting back and looking ahead: Updating the Comparative Outcomes and Service Utilization Trends (COAST) Study to include 28 years of linked data from people with and without HIV in British Columbia, Canada.","authors":"Michael O Budu, Katherine W Kooij, Kate Heath, Taylor McLinden, Claudette Cardinal, Scott D Emerson, Paul Sereda, Jason Trigg, Jenny Li, Erin Ding, Mark W Hull, Kate Salters, Viviane D Lima, Rolando Barrios, Julio S G Montaner, Robert S Hogg","doi":"10.23889/ijpds.v10i1.2496","DOIUrl":"10.23889/ijpds.v10i1.2496","url":null,"abstract":"<p><strong>Introduction: </strong>The Comparative Outcomes and Service Utilization Trends (COAST) study compares health outcomes among People With HIV (PWH) and People Without HIV (PWoH) in British Columbia (BC), Canada. The cohort was recently updated to include persons diagnosed with HIV after March 31, 2013, and expanded to broaden research applications.</p><p><strong>Methods: </strong>COAST includes PWH and a 10% random sample of the general population without HIV, all aged ≥19. Our study links an HIV registry to healthcare practitioner billing, hospital and emergency department attendance data, prescription drug dispensations, and a cancer registry. Our cohort update included new sampling strategies, adding data on emergency department visits not previously captured, and extending our follow-up period to 28 years (from 1992 to 2020). COAST now includes 17,119 PWH and 615,264 PWoH.</p><p><strong>Findings to date: </strong>COAST has contributed to our understanding of combination antiretroviral therapy (ART) use, health service utilization, chronic diseases, mental health and substance use disorders, and mortality among PWH in BC. Key findings include earlier age at diagnosis of certain chronic conditions, a higher incidence of mood disorders among PWH, and noteworthy shifts in causes of death among PWH on ART. 
The updated cohort will provide insights into the changing nature of the population living with HIV in BC and serves as a novel foundation for further research.</p><p><strong>Future plans: </strong>To explore and extend knowledge of the evolving trends among people living and aging with HIV in BC, regular data linkage updates and the inclusion of additional datasets are scheduled every two years.</p>","PeriodicalId":36483,"journal":{"name":"International Journal of Population Data Science","volume":"10 1","pages":"2496"},"PeriodicalIF":1.6,"publicationDate":"2025-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11922098/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143665089","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Claudia Medina Coeli, Rosa Maria Soares Madeira Domingues, Lana Meijinhos, Daniela Medina Coeli Bastos, Rejane Sobrino Pinheiro, Valeria Saraceni, Marcos Augusto Bastos Dias, Natália Santana Paiva, Kenneth Rochel de Camargo
{"title":"Using a deterministic matching computer routine to identify hospital episodes in a Brazilian de-identified administrative database for the analysis of obstetrics hospitalisations.","authors":"Claudia Medina Coeli, Rosa Maria Soares Madeira Domingues, Lana Meijinhos, Daniela Medina Coeli Bastos, Rejane Sobrino Pinheiro, Valeria Saraceni, Marcos Augusto Bastos Dias, Natália Santana Paiva, Kenneth Rochel de Camargo","doi":"10.23889/ijpds.v10i1.2467","DOIUrl":"10.23889/ijpds.v10i1.2467","url":null,"abstract":"<p><strong>Introduction: </strong>The absence of a unique patient identifier in the Brazilian hospital administrative database prevents the identification of hospital episodes with multiple hospitalisations of the same patient.</p><p><strong>Objectives: </strong>This study aims to evaluate the information gain by using a computer routine to identify acute Obstetrics hospital episodes and its impact on assessing marks of case severity.</p><p><strong>Methods: </strong>The data source was a de-identified Brazilian hospital administrative database from 2017 to 2020, including hospitalisations records of women of reproductive age (10 to 49 years old) for treating acute conditions (N=16,087,490). We processed this database by combining C++ and Python routines to create a hospital episodes database. From the latter, we selected obstetrics hospital episodes from 2018 to 2019 (N = 4,926,877). We compared selected characteristics of the hospital episodes according to their type (multiple vs single records per episode), testing for differences using effect size measures. We compared relative differences in case severity marks when using the hospital episode as the unit of analysis to that of isolated hospitalisations (N = 5,018,350).</p><p><strong>Results: </strong>Compared to single-record episodes, multiple-records episodes had longer length of stay, higher amount reimbursed, and lower proportion of discharge alive. 
When hospital episodes rather than isolated hospitalisations were used as the unit of analysis, all case severity indicators increased, most notably hospital deaths, with an increment of 13.15%. The computer routine reduced the proportion of hospital admissions whose reason for discharge did not indicate the outcome (hospital stay or inter-hospital transfer) from 2.29% to 0.73%.</p><p><strong>Conclusions: </strong>The deterministic matching computer routine proved valuable for identifying records that refer to the same hospital episode, which improved the assessment of severe cases.</p>","PeriodicalId":36483,"journal":{"name":"International Journal of Population Data Science","volume":"10 1","pages":"2467"},"PeriodicalIF":1.6,"publicationDate":"2025-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11874899/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143558254","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jayu Jung, Sarah Cattan, Claire Powell, Jane Barlow, Mengyun Liu, Amanda Clery, Louise Mc Grath-Lone, Catherine Bunting, Jenny Woodman
{"title":"Early child development in England: cross-sectional analysis of ASQ<sup>®</sup>-3 records from the 2-2½-year universal health visiting review using national administrative data (Community Service Dataset, CSDS).","authors":"Jayu Jung, Sarah Cattan, Claire Powell, Jane Barlow, Mengyun Liu, Amanda Clery, Louise Mc Grath-Lone, Catherine Bunting, Jenny Woodman","doi":"10.23889/ijpds.v9i2.2459","DOIUrl":"10.23889/ijpds.v9i2.2459","url":null,"abstract":"<p><strong>Introduction: </strong>The Ages & Stages Questionnaire 3rd Edition (ASQ<sup>®</sup>-3) is a tool, originally developed in the United States, to measure developmental delay in children aged 1-66 months. This measure has been collected in England since 2015 as a part of mandated 2-2½-year health visiting reviews and collated nationally in the Community Services Dataset (CSDS). CSDS is known to be incomplete and, to date, there have not been any published analyses of ASQ<sup>®</sup>-3 held within CSDS.</p><p><strong>Objectives: </strong>This study aimed to a) identify a subset of complete child development data for children aged two in England using ASQ<sup>®</sup>-3 data in CSDS between 2018/19-2020/21; b) use this subset of data to analyse child development at age 2-2½ years in England.</p><p><strong>Methods: </strong>This study compared counts of ASQ<sup>®</sup>-3 records in CSDS by local authority and financial quarter against national, publicly available Health Visitor Service Delivery Metrics (HVSDM) to identify local authorities with complete ASQ<sup>®</sup>-3 records in CSDS. This study described child development in this subset of the data using both a binary cut-off of whether a child reached the expected level of development and the continuous ASQ<sup>®</sup>-3 score.</p><p><strong>Results: </strong>Among the 226,505 children from 64 local authorities in the sample with complete ASQ<sup>®</sup>-3 data, 86.2% met the expected level of development. 
Children from the most deprived neighbourhoods (82.6%), children recorded as Black (78.9%), and boys (81.7%) were less likely to meet the expected level of development.</p><p><strong>Conclusions: </strong>To fully understand early child development across England, the completeness of ASQ<sup>®</sup>-3 data in the CSDS requires improvement. In addition, to interpret the national CSDS data on child development, ASQ<sup>®</sup>-3 should be standardised and validated in an English context with attention paid to implementation and subsequent referral and support pathways. Our study provides a minimum estimate of children needing developmental support (13.8%), with many more children likely to be experiencing moderate or mild delay but not identified by the ASQ<sup>®</sup>-3 cut-offs for expected development.</p>","PeriodicalId":36483,"journal":{"name":"International Journal of Population Data Science","volume":"9 2","pages":"2459"},"PeriodicalIF":1.6,"publicationDate":"2025-02-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11934300/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143711552","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yangmei Li, Jennifer J Kurinczuk, Fiona Alderdice, Maria A Quigley, Oliver Rivero-Arias, Julia Sanders, Sara Kenyon, Dimitrios Siassakos, Nikesh Parekh, Suresha De Almeida, Claire Carson
{"title":"Addressing uncertainty in identifying pregnancies in the English CPRD GOLD Pregnancy Register: a methodological study using a worked example.","authors":"Yangmei Li, Jennifer J Kurinczuk, Fiona Alderdice, Maria A Quigley, Oliver Rivero-Arias, Julia Sanders, Sara Kenyon, Dimitrios Siassakos, Nikesh Parekh, Suresha De Almeida, Claire Carson","doi":"10.23889/ijpds.v10i1.2471","DOIUrl":"10.23889/ijpds.v10i1.2471","url":null,"abstract":"<p><strong>Introduction: </strong>Electronic health records are invaluable for pregnancy-related studies. The Clinical Practice Research Datalink (CPRD) Pregnancy Register (PR) identifies pregnancies in primary care records, including uncertain cases.</p><p><strong>Objectives: </strong>This paper outlines a method to reduce uncertainty in identifying pregnancies within CPRD GOLD PR data, exemplified through a study investigating the provision of pre-pregnancy care.</p><p><strong>Methods: </strong>We used CPRD Mother Baby Link (MBL) and Maternity Hospital Episode Statistics (HES) to clean and augment the CPRD PR data. The study included all women aged 18-48yrs, registered at an English GP practice within CPRD on 01/01/2017, with a year of prior registration and eligibility for hospital data linkage. We developed a cleaning and combining algorithm and further applied strict data quality criteria to form three populations: 'as provided', 'derived' (using our algorithm) and 'strictly derived' (with stricter data quality criteria). 
We compared characteristics and outcomes across these populations, examining potential biases in effect estimates using the 'as provided' population.</p><p><strong>Results: </strong>Our algorithm added 22,270 (~7%) pregnancies from hospital data to the CPRD PR (1997-2021), eliminated conflicting pregnancies and pregnancies with unknown outcomes, and minimised potentially non-contemporaneous records of past pregnancies or partial records of pregnancies. For all pregnancies across women's reproductive history, the 'strictly derived' population, characterised by better data quality, showed a higher prevalence of pre-existing medical conditions and increased pre-pregnancy care. In this dataset, recording of both exposure and outcome was better, and the magnitude of the association between exposure and outcome was reduced compared to the 'as provided' population.</p><p><strong>Conclusion: </strong>PR data requires cleaning before use. This study presents a pragmatic and practical method to identify pregnancies using existing CPRD data and linked records, without needing additional data. Researchers should carefully consider their studies' specific requirements and may adapt our proposed methodology accordingly to align with their research questions.</p>","PeriodicalId":36483,"journal":{"name":"International Journal of Population Data Science","volume":"10 1","pages":"2471"},"PeriodicalIF":1.6,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11874892/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143558249","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Robert T Maddison, Karen R Reed, Rebecca Cannings-John, Fiona Lugg-Widger, Thomas Stoneman, Sarah Anderson, Andrew E Fry
{"title":"Adapting historical clinical genetic test records for anonymised data linkage: obstacles and opportunities.","authors":"Robert T Maddison, Karen R Reed, Rebecca Cannings-John, Fiona Lugg-Widger, Thomas Stoneman, Sarah Anderson, Andrew E Fry","doi":"10.23889/ijpds.v8i5.2924","DOIUrl":"10.23889/ijpds.v8i5.2924","url":null,"abstract":"<p><strong>Introduction: </strong>Cystic fibrosis (CF) heterozygotes (also known as 'carriers') are people who have one mutated copy of the <i>CFTR</i> gene. Research into the health risks of CF carriers has been limited by a lack of large cohorts tested for CF carrier status, but routine clinical testing identifies CF carriers in the population. Such test records additionally contain large amounts of clinical information, making them a valuable research resource to not only identify CF carriers in the population but also to provide additional data not found elsewhere.</p><p><strong>Methods: </strong>Following governance approvals, we adapted 30 years' worth of CF genetic testing records generated by the All-Wales Medical Genomics Service (AWMGS) and submitted them to the SAIL Databank for anonymised linkage.</p><p><strong>Results: </strong>Unexpected obstacles meant that only a minimal amount of clinical information could be annotated ahead of linkage. The raw data were highly heterogeneous due to the records' longitudinal collection and clinical origins, making standardisation difficult. Moreover, the presence of unique identifiers in the clinical data violated the separation principle, requiring manual annotation to produce a cleaned dataset. Explicit identification of patients or their relatives throughout the records complicated split file anonymisation.</p><p><strong>Conclusion: </strong>Extracting useful information from historical clinical genetic test records is a significant challenge with technical and governance aspects. 
The mixing of unique identifiers with clinical data in heterogeneous, unstructured free text combined with a lack of automated tools meant that manual annotation was required to adhere to the separation principle. As such, only a minimum of the available clinical data was annotatable within the project timeline and mutually exclusive access to the identifiable and pseudonymised data meant that annotations could not later be validated. Future efforts to link clinical genetic test records for research must consider these challenges in their approach.</p>","PeriodicalId":36483,"journal":{"name":"International Journal of Population Data Science","volume":"8 5","pages":"2924"},"PeriodicalIF":1.6,"publicationDate":"2025-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11922013/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143665092","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Andy Boyd, Katharine M Evans, Emma L Turner, Robin Flaig, Jacqui Oakley, Kirsteen C Campbell, Richard Thomas, Stela McLachlan, Matthew Crane, Rebecca Whitehorn, Rachel Calkin, Abigail Hill, Samantha Berman, David Ford, Martin Tobin, David Porteous, Danielle F Gomes, Maria-Paz Garcia, Andrew Wong, Aida Sanchez, Chris Orton, Simon Thompson, John Gulliver, Kathryn Adams, Ellena Badrick, Chiara Batini, Michaela Benzeval, Susie Boatman, Gerome Breen, Shannon Bristow, Abigail Britten, Luke Bryant, Adam Butterworth, Archie Campbell, Sarah Chave, John Danesh, Jayati Das-Munshi, Karen Dennison, Emanuele Di Angelantonio, Thalia C Eley, Helen Fisher, Emla Fitzsimons, Alissa Goodman, Michael Gregg, Anna L Guyatt, Anna Hansell, Rebecca Harmston, Andy Heard, Morag Henderson, Rosie Hill, Szu-Chia Huang, Catherine John, Frank Kee, Nathalie Kingston, Jack Kneeshaw, Rashmi Kumar, Genevieve Lachance, Celestine Lockhart, Hazel Lockhart-Jones, Sarah Markham, Dan Mason, Bernadette McGuinness, Maisie McKenzie, Amy McMahon, Chelsea Mika Malouf, Mark Mumme, Charlotte Neville, Kate Northstone, Zoe Oldfield, Dara O'Neill, Manish Pareek, John Pickavance, Yasmin Rahman, Holly Reilly, Angela Scott, Deb Smith, Andrew Steptoe, Claire Steves, Cathie Sudlow, Gerald Sze, Nicholas L Timpson, Tapiwa Tungamirai, Laura Venn, Matthew Walker, Neil Walker, Nicolas Wareham, Aidan Watmuff, Tony Webb, Karen Williams, John Wright, Darioush Yarand, George B Ploubidis, John Macleod, Jonathan Ac Sterne, Nishi Chaturvedi
{"title":"UK Longitudinal Linkage Collaboration (UK LLC): The National Trusted Research Environment for Longitudinal Research.","authors":"Andy Boyd, Katharine M Evans, Emma L Turner, Robin Flaig, Jacqui Oakley, Kirsteen C Campbell, Richard Thomas, Stela McLachlan, Matthew Crane, Rebecca Whitehorn, Rachel Calkin, Abigail Hill, Samantha Berman, David Ford, Martin Tobin, David Porteous, Danielle F Gomes, Maria-Paz Garcia, Andrew Wong, Aida Sanchez, Chris Orton, Simon Thompson, John Gulliver, Kathryn Adams, Ellena Badrick, Chiara Batini, Michaela Benzeval, Susie Boatman, Gerome Breen, Shannon Bristow, Abigail Britten, Luke Bryant, Adam Butterworth, Archie Campbell, Sarah Chave, John Danesh, Jayati Das-Munshi, Karen Dennison, Emanuele Di Angelantonio, Thalia C Eley, Helen Fisher, Emla Fitzsimons, Alissa Goodman, Michael Gregg, Anna L Guyatt, Anna Hansell, Rebecca Harmston, Andy Heard, Morag Henderson, Rosie Hill, Szu-Chia Huang, Catherine John, Frank Kee, Nathalie Kingston, Jack Kneeshaw, Rashmi Kumar, Genevieve Lachance, Celestine Lockhart, Hazel Lockhart-Jones, Sarah Markham, Dan Mason, Bernadette McGuinness, Maisie McKenzie, Amy McMahon, Chelsea Mika Malouf, Mark Mumme, Charlotte Neville, Kate Northstone, Zoe Oldfield, Dara O'Neill, Manish Pareek, John Pickavance, Yasmin Rahman, Holly Reilly, Angela Scott, Deb Smith, Andrew Steptoe, Claire Steves, Cathie Sudlow, Gerald Sze, Nicholas L Timpson, Tapiwa Tungamirai, Laura Venn, Matthew Walker, Neil Walker, Nicolas Wareham, Aidan Watmuff, Tony Webb, Karen Williams, John Wright, Darioush Yarand, George B Ploubidis, John Macleod, Jonathan Ac Sterne, Nishi Chaturvedi","doi":"10.23889/ijpds.v10i1.2468","DOIUrl":"10.23889/ijpds.v10i1.2468","url":null,"abstract":"<p><strong>Introduction: </strong>The UK Longitudinal Linkage Collaboration (UK LLC) is the national Trusted Research Environment (TRE) for the UK's longitudinal research community, supporting the UK's unparalleled collection of Longitudinal Population Studies (LPS). 
Initially set up as a COVID-19 research resource, UK LLC is now a generic database for any research for the public good.</p><p><strong>Objectives: </strong>UK LLC supports longitudinal research by providing record linkage and TRE services.</p><p><strong>Methods: </strong>The UK LLC partnership provides a secure analytics environment, a trusted third-party linkage processor and a comprehensive governance framework to minimise risks to participant confidentiality. UK LLC is ISO 27001 certified and accredited by the UK Statistics Authority as a processor under the Digital Economy Act. The active involvement by members of UK LLC's public involvement programme ensures UK LLC is acceptable to LPS participants and the wider public. All UK LPS are eligible for inclusion. Researchers can apply to access the TRE via an approach that fulfils the needs of the LPS, the linked data owners and includes a review by public contributors.</p><p><strong>Results: </strong>Twenty-two LPS have so far joined UK LLC. Where permissions allow, participants are linked to their National Health Service (NHS) England, NHS Wales and place-based records, with work ongoing to link to NHS Scotland and non-health administrative records, including Department for Work and Pensions and His Majesty's (HM) Revenue and Customs. UK LLC Explore allows potential researchers to discover the breadth of data available in the TRE. All applications are listed on UK LLC's publicly accessible Data Access Register.</p><p><strong>Conclusions: </strong>UK LLC enables researchers to interrogate pooled LPS participant data that are systematically linked to diverse records. 
UK LLC remains open to additional LPS joining the partnership and will increase the breadth of data to support the longitudinal research community and attract increasing numbers of researchers across multiple disciplines, government departments and industry.</p>","PeriodicalId":36483,"journal":{"name":"International Journal of Population Data Science","volume":"10 1","pages":"2468"},"PeriodicalIF":1.6,"publicationDate":"2025-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11931487/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143701726","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Selin Akaraci, Alison Macfarlane, Amal Rammah, Emilie Courtin, Esther Lewis, Faith Miller, Jason Powell-Bavester, Jessica Mitchell, Joana Cruz, Matthew Lilliman, Niloofar Shoari, Samantha Hajna, Steven Cummins, Tolu Adedire, Vahe Nafilyan, Pia Hardelid
{"title":"Kids' Environment and Health Cohort: Database Protocol: supplementary appendix.","authors":"Selin Akaraci, Alison Macfarlane, Amal Rammah, Emilie Courtin, Esther Lewis, Faith Miller, Jason Powell-Bavester, Jessica Mitchell, Joana Cruz, Matthew Lilliman, Niloofar Shoari, Samantha Hajna, Steven Cummins, Tolu Adedire, Vahe Nafilyan, Pia Hardelid","doi":"10.23889/ijpds.v10i1.2475","DOIUrl":"10.23889/ijpds.v10i1.2475","url":null,"abstract":"<p><strong>Introduction: </strong>Environmental exposures are known to affect the health and well-being of populations throughout the life course. Children are particularly susceptible to environmental impacts on educational and health outcomes as they spend more time in their local environments compared to adults. In England, no national, longitudinal dataset linking information about the physical and social environment in and around homes and schools to children's health and education outcomes currently exists. This limits our understanding of how environments might impact the health and well-being of children as they grow up.</p><p><strong>Objective: </strong>To establish the Kids' Environment and Health Cohort, a research-ready, de-identified and annually updated national birth cohort of all children born in England from 2006 onwards.</p><p><strong>Methods: </strong>The Kids' Environment and Health Cohort will link birth and mortality records, health and educational attainment datasets, to maternal health (up to 12 months prior to their child's birth), and environmental data for all children born in England from 2006 - approximately 11 million children at first build. A subset of children born between 2010 and 2012, and between 2020 and 2022 will be linked to their mothers' 2011 or 2021 Census records, respectively. The cohort database will be held in, and accessed via, a trusted research environment (TRE) at the Office for National Statistics (ONS). 
All geographical identifiers in the cohort, allowing for linkage to further environmental data, will be securely held by the ONS, separately to the main cohort, and will be encrypted before being shared with researchers.</p><p><strong>Conclusion: </strong>The Kids' Environment and Health Cohort will, for the first time, link administrative health and education data to longitudinal environmental exposures for children at national level in England. It will serve as a data resource to support research about the health and well-being of children via improved home and school environments.</p>","PeriodicalId":36483,"journal":{"name":"International Journal of Population Data Science","volume":"10 1","pages":"2475"},"PeriodicalIF":1.6,"publicationDate":"2025-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11878347/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143558251","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}