Addressing uncertainty in identifying pregnancies in the English CPRD GOLD Pregnancy Register: a methodological study using a worked example.

IF 2.2 Q3 HEALTH CARE SCIENCES & SERVICES
International Journal of Population Data Science. Pub Date: 2025-02-25; eCollection Date: 2025-01-01; DOI: 10.23889/ijpds.v10i1.2471
Yangmei Li, Jennifer J Kurinczuk, Fiona Alderdice, Maria A Quigley, Oliver Rivero-Arias, Julia Sanders, Sara Kenyon, Dimitrios Siassakos, Nikesh Parekh, Suresha De Almeida, Claire Carson
Citations: 0

Abstract


Addressing uncertainty in identifying pregnancies in the English CPRD GOLD Pregnancy Register: a methodological study using a worked example.

Introduction: Electronic health records are invaluable for pregnancy-related studies. The Clinical Practice Research Datalink (CPRD) Pregnancy Register (PR) identifies pregnancies in primary care records, including uncertain cases.

Objectives: This paper outlines a method to reduce uncertainty in identifying pregnancies within CPRD GOLD PR data, exemplified through a study investigating the provision of pre-pregnancy care.

Methods: We used CPRD Mother Baby Link (MBL) and Maternity Hospital Episode Statistics (HES) to clean and augment the CPRD PR data. The study included all women aged 18-48 years who were registered at an English GP practice within CPRD on 1 January 2017, had a year of prior registration, and were eligible for hospital data linkage. We developed a cleaning and combining algorithm and then applied stricter data quality criteria, forming three populations: 'as provided', 'derived' (using our algorithm) and 'strictly derived' (with stricter data quality criteria). We compared characteristics and outcomes across these populations and examined potential biases in effect estimates obtained from the 'as provided' population.
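
The inclusion criteria above can be read as a simple row-level filter. The following is a minimal sketch only: the record layout and field names are hypothetical (they are not CPRD variable names), age is assumed to be measured on the index date, and "a year of prior registration" is assumed to mean registration starting on or before 1 January 2016. None of these details are confirmed by the abstract.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

INDEX_DATE = date(2017, 1, 1)           # from the abstract: registered on 01/01/2017
PRIOR_REG_CUTOFF = date(2016, 1, 1)     # assumption: one calendar year before the index date


@dataclass
class Woman:
    # Hypothetical record layout for illustration; not CPRD field names.
    birth_date: date
    registration_start: date
    registration_end: Optional[date]    # None means still registered
    english_practice: bool
    hes_linkage_eligible: bool


def age_on(birth_date: date, on: date) -> int:
    """Age in completed years on the given date."""
    had_birthday = (on.month, on.day) >= (birth_date.month, birth_date.day)
    return on.year - birth_date.year - (0 if had_birthday else 1)


def is_eligible(w: Woman) -> bool:
    """Apply the inclusion criteria as interpreted in the lead-in (assumptions noted there)."""
    registered_on_index = (
        w.registration_start <= INDEX_DATE
        and (w.registration_end is None or w.registration_end >= INDEX_DATE)
    )
    return (
        w.english_practice
        and w.hes_linkage_eligible
        and registered_on_index
        and w.registration_start <= PRIOR_REG_CUTOFF
        and 18 <= age_on(w.birth_date, INDEX_DATE) <= 48
    )
```

Under these assumptions, a woman born on 15 June 1990 and registered since 2010 would be included, while one who registered in mid-2016 would be excluded for lacking a year of prior registration.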

Results: Our algorithm added 22,270 (~7%) pregnancies from hospital data to the CPRD PR (1997-2021), eliminated conflicting pregnancies and pregnancies with unknown outcomes, and minimised potentially non-contemporaneous records of past pregnancies or partial records of pregnancies. Across all pregnancies in women's reproductive history, the 'strictly derived' population, characterised by better data quality, showed a higher prevalence of pre-existing medical conditions and more pre-pregnancy care. In this dataset, recording of both exposure and outcome was better, and the magnitude of the association between exposure and outcome was reduced compared to the 'as provided' population.
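
The abstract does not spell out the rules of the cleaning and combining algorithm, so the following is only an illustrative sketch of the three actions it names (adding hospital-identified pregnancies, dropping unknown outcomes, dropping conflicting pregnancies). The Pregnancy record, the overlap-based definition of "conflicting", the order of the steps, and the choice to drop every episode involved in a conflict are all assumptions made for this example, not the authors' method.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class Pregnancy:
    # Hypothetical episode record; not the Pregnancy Register schema.
    patient_id: int
    start: date
    end: date
    outcome: str    # e.g. "live birth" or "unknown"
    source: str     # "register" or "hospital"


def overlaps(a: Pregnancy, b: Pregnancy) -> bool:
    """True if both episodes belong to the same woman and their dates overlap."""
    return a.patient_id == b.patient_id and a.start <= b.end and b.start <= a.end


def derive(register: List[Pregnancy], hospital: List[Pregnancy]) -> List[Pregnancy]:
    # Step 1 (assumed rule): add hospital episodes that match no register episode.
    added = [h for h in hospital if not any(overlaps(h, r) for r in register)]
    combined = list(register) + added
    # Step 2: drop episodes whose outcome is unknown.
    known = [p for p in combined if p.outcome != "unknown"]
    # Step 3 (assumed rule): drop every episode that overlaps another
    # remaining episode for the same woman.
    return [p for p in known
            if not any(overlaps(p, q) for q in known if q is not p)]
```

A real implementation would need the register's own definitions of conflicting and uncertain episodes; this sketch only shows how the three actions compose.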

Conclusion: PR data requires cleaning before use. This study presents a pragmatic and practical method to identify pregnancies using existing CPRD data and linked records, without needing additional data. Researchers should carefully consider their studies' specific requirements and may adapt our proposed methodology accordingly to align with their research questions.

Source journal
CiteScore: 2.50
Self-citation rate: 0.00%
Articles published: 386
Review time: 20 weeks