Probabilistically Linkage of California's Birth Certificate and Hospital Discharge Data.
Authors: Bharti Garg, Aaron B Caughey, Blair G Darney
Journal: Medical Care. Publication date: 2025-03-31. DOI: 10.1097/MLR.0000000000002139 (https://doi.org/10.1097/MLR.0000000000002139)
Objective: To link California's birth certificate data with maternal and infant hospital discharge data, creating a database for epidemiological research.
Background: Secondary data sources are widely used for epidemiological research. Although California's birth certificate data and patient discharge data (PDD) are readily available separately, linked data are available only through 2012. We obtained birth certificate data from the California Department of Public Health and hospital discharge data from the Department of Health Care Access and Information. In this study, we propose a methodology for linking these 2 datasets for use in perinatal epidemiological research. We used data from 2008 to 2019.
Methods: We used probabilistic linkage methods to link birth certificates and hospital discharge data. Hospital discharge data were included as 2 datasets: maternal and infant discharge records. The linkage was a 2-step process: (1) linkage of birth certificates with infant hospital discharge data to form a combined dataset; (2) linkage of the combined birth certificate-infant discharge data with maternal discharge data.
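The abstract does not state which scoring model, blocking keys, or thresholds were used, so the following is only a minimal sketch of the 2-step process under stated assumptions: a Fellegi-Sunter style match weight, greedy one-to-one assignment, and made-up m/u probabilities. The field names, the toy records, the threshold of 10, and the fields used for the maternal step are all hypothetical and are not taken from the study.

```python
import math

# Hypothetical per-field probabilities (illustrative values, not from the study).
# m = P(field agrees | true match); u = P(field agrees | non-match).
INFANT_PROBS = {
    "hospital": (0.98, 0.01),
    "birth_date": (0.99, 0.003),
    "sex": (0.99, 0.5),
    "zip": (0.90, 0.001),
    "county": (0.98, 0.02),
}
# Assumed subset for the maternal step; "birth_date" stands in for delivery date.
MOTHER_PROBS = {k: INFANT_PROBS[k] for k in ("hospital", "birth_date", "zip", "county")}

def match_weight(a, b, probs=INFANT_PROBS):
    """Sum log2 agreement or disagreement weights over the compared fields."""
    total = 0.0
    for field, (m, u) in probs.items():
        if a.get(field) is not None and a.get(field) == b.get(field):
            total += math.log2(m / u)              # agreement raises the score
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement or missing lowers it
    return total

def link(left, right, block_key, threshold, probs=INFANT_PROBS):
    """Compare records only within blocks, then keep the best one-to-one pairs."""
    blocks = {}
    for j, rec in enumerate(right):
        blocks.setdefault(rec.get(block_key), []).append(j)
    scored = []
    for i, a in enumerate(left):
        for j in blocks.get(a.get(block_key), []):
            w = match_weight(a, right[j], probs)
            if w >= threshold:
                scored.append((w, i, j))
    scored.sort(reverse=True)  # highest weights are assigned first
    used_left, used_right, links = set(), set(), {}
    for w, i, j in scored:
        if i not in used_left and j not in used_right:
            links[i] = j
            used_left.add(i)
            used_right.add(j)
    return links

# Toy records (hypothetical).
births = [
    {"id": "B1", "hospital": "H1", "birth_date": "2010-01-01", "sex": "F",
     "zip": "90001", "county": "Los Angeles"},
    {"id": "B2", "hospital": "H2", "birth_date": "2010-01-01", "sex": "M",
     "zip": "94110", "county": "San Francisco"},
]
infants = [
    {"hospital": "H2", "birth_date": "2010-01-01", "sex": "M",
     "zip": "94110", "county": "San Francisco"},
    {"hospital": "H1", "birth_date": "2010-01-01", "sex": "F",
     "zip": "90002", "county": "Los Angeles"},  # zip differs from B1
]
mothers = [
    {"hospital": "H1", "birth_date": "2010-01-01", "zip": "90001",
     "county": "Los Angeles"},
]

# Step 1: birth certificates to infant discharge records, blocked on birth date.
step1 = link(births, infants, "birth_date", threshold=10.0)
# Merge each linked pair; birth certificate values take precedence on conflicts.
combined = [{**infants[j], **births[i]} for i, j in sorted(step1.items())]
# Step 2: combined records to maternal discharge records, blocked on hospital.
step2 = link(combined, mothers, "hospital", threshold=10.0, probs=MOTHER_PROBS)
```

In this toy run both birth certificates link to an infant record despite one zip code disagreement, and only the first combined record finds a maternal record. In practice the m/u probabilities are usually estimated from the data rather than fixed by hand.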
Results: We included 5,661,695 births from birth certificates and 5,617,921 infant discharge records. We linked 92.2% of the birth certificate records with infant discharge records using the following variables: maternity hospital, infant's birth date, infant's sex, mother's residence zip code, and county of the birth hospital. When the combined vital statistics-infant PDD data were linked with maternal PDD data, 90.0% of vital statistics records linked with both infant and maternal PDD, 2.5% linked with infant PDD only, and 1.5% linked with maternal PDD only.
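As an illustration of how a breakdown of this kind can be tabulated, the sketch below classifies each birth certificate record by its two link outcomes and reports each category as a percentage. The function names and the ten toy records are hypothetical; the toy shares are deliberately different from the study's results.

```python
from collections import Counter

def link_status(infant_linked, mother_linked):
    """Classify one birth certificate record by which discharge files it linked to."""
    if infant_linked and mother_linked:
        return "both"
    if infant_linked:
        return "infant_only"
    if mother_linked:
        return "maternal_only"
    return "neither"

def summarize(flags):
    """Return the percentage of records in each link category, rounded to 1 decimal."""
    counts = Counter(link_status(i, m) for i, m in flags)
    total = len(flags)
    return {status: round(100 * n / total, 1) for status, n in counts.items()}

# Toy input: (linked to infant PDD, linked to maternal PDD) per birth certificate.
flags = [(True, True)] * 7 + [(True, False), (False, True), (False, False)]
summary = summarize(flags)
```

Here `summary` holds one percentage per category, including records that linked to neither file, which the reported figures do not list explicitly.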
Conclusion: Our linkage algorithm produces linked data suitable for epidemiological research. The process is complex and needs to be re-evaluated every year, because some variables change and additional information becomes available in some files.
About the journal:
Rated as one of the top ten journals in healthcare administration, Medical Care is devoted to all aspects of the administration and delivery of healthcare. This scholarly journal publishes original, peer-reviewed papers documenting the most current developments in the rapidly changing field of healthcare. This timely journal reports on the findings of original investigations into issues related to the research, planning, organization, financing, provision, and evaluation of health services.