Longitudinal Methods Versus Multiple Imputation to Infer Missing Maternal Data in Registry-Based Pregnancy Studies
Takamasa Sakai, Hedvig Nordeng, Marleen M H J van Gelder
Paediatric and Perinatal Epidemiology. Published 2025-06-27. DOI: 10.1111/ppe.70011 (https://doi.org/10.1111/ppe.70011)
Abstract
Background: In birth registries, incomplete recording of information leads to missing values. Multiple imputation (MI) by chained equations is a widely used method for analysing datasets with missing data. It is unknown whether using registry records from multiple pregnancies contributed by the same woman gives more accurate values when resolving missing data.
Objectives: To investigate the relative performance of five methods to infer missing data on maternal characteristics using data from a medical birth registry, comparing longitudinal methods and MI with data from previous and future pregnancies.
Methods: We used data from the Medical Birth Registry of Norway (MBRN), selecting records among mothers with more than one pregnancy between 2004 and 2018. Longitudinal methods used reference pregnancies in three time directions: past, future and closest pregnancy record. MI was conducted with only index pregnancy records (single-pregnancy MI) and with both index and closest reference pregnancy records (multiple-pregnancy MI). Validity was assessed by comparing the actual values with inferred/imputed values. For continuous variables, we calculated the proportion of inferred values within predefined increments. For binary variables, we calculated five parameters: agreement rate, sensitivity, specificity, positive predictive value and negative predictive value.
Results: We included 578,670 pregnancies among 256,658 women. For continuous variables, the longitudinal methods showed the highest proportion within predefined increments, followed by multiple-pregnancy MI, and single-pregnancy MI showed the lowest value. For binary variables, longitudinal methods generally showed higher values among the five validity parameters than MI. Single-pregnancy MI had substantially lower agreement, while multiple-pregnancy MI performed similarly to longitudinal methods.
Conclusions: The longitudinal method outperformed MI in inferring missing data on maternal characteristics in a medical birth registry.
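As an illustration of the approach described in the Methods, the following is a minimal Python sketch of the "closest pregnancy record" longitudinal method and the five validity parameters for a binary variable. It is not the authors' implementation. The `Pregnancy` record, the `smoker` field, the use of calendar year as the distance measure, the tie-break toward the earlier pregnancy, and the toy data are all assumptions made for this example; the abstract does not specify them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pregnancy:
    # Hypothetical record layout; the registry's real variables are not given in the abstract.
    mother_id: int
    year: int
    smoker: Optional[bool]  # None marks a missing value


def infer_from_closest(index, records):
    """Infer a value for `index` from the same mother's closest other pregnancy.

    Assumption: distance is measured in calendar years, and ties are broken
    in favour of the earlier (past) pregnancy.
    """
    candidates = [
        r for r in records
        if r.mother_id == index.mother_id and r is not index and r.smoker is not None
    ]
    if not candidates:
        return None  # no reference pregnancy with a recorded value
    closest = min(candidates, key=lambda r: (abs(r.year - index.year), r.year))
    return closest.smoker


def validity(actual, inferred):
    """Agreement rate, sensitivity, specificity, PPV and NPV for binary values."""
    pairs = list(zip(actual, inferred))
    tp = sum(1 for a, i in pairs if a and i)
    tn = sum(1 for a, i in pairs if not a and not i)
    fp = sum(1 for a, i in pairs if not a and i)
    fn = sum(1 for a, i in pairs if a and not i)

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "agreement": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Toy data (invented for illustration only).
records = [
    Pregnancy(1, 2005, True), Pregnancy(1, 2008, True), Pregnancy(1, 2012, False),
    Pregnancy(2, 2006, False), Pregnancy(2, 2009, False),
    Pregnancy(3, 2010, True), Pregnancy(3, 2011, False),
]

# Validation idea: treat each recorded value as if it were missing, infer it
# from the closest reference pregnancy, and compare with the actual value.
actual = [r.smoker for r in records]
inferred = [infer_from_closest(r, records) for r in records]
metrics = validity(actual, inferred)
print(metrics)
```

The "past" and "future" variants would differ only in restricting `candidates` to pregnancies before or after the index pregnancy.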
About the journal:
Paediatric and Perinatal Epidemiology crosses the boundaries between the epidemiologist and the paediatrician, obstetrician or specialist in child health, ensuring that important paediatric and perinatal studies reach those clinicians for whom the results are especially relevant. In addition to original research articles, the Journal also includes commentaries, book reviews and annotations.