A prediction model for classifying maternal pregnancy smoking using California state birth certificate information
Di He, Xiwen Huang, Onyebuchi A Arah, Douglas I Walker, Dean P Jones, Beate Ritz, Julia E Heck
Paediatric and Perinatal Epidemiology. Journal article, published 2024-02-01 (Epub 2023-11-15). DOI: https://doi.org/10.1111/ppe.13021. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10922711/pdf/
Citations: 0
Abstract
Background: Systematically recorded smoking data are not always available in vital statistics records, and even when available they can underestimate true smoking rates.
Objective: To develop a prediction model for maternal tobacco smoking in late pregnancy based on birth certificate information using a combination of self- or provider-reported smoking and biomarkers (smoking metabolites) in neonatal blood spots as the alloyed gold standard.
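One way to picture an alloyed gold standard is as a rule that labels a mother a smoker when either the report or the biomarker indicates smoking. The sketch below assumes such an either-or rule; the abstract does not state how the two sources were combined, and the record fields and the intensity cut-off are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MotherRecord:
    # Self- or provider-reported smoking; None when the field is missing.
    reported_smoking: Optional[bool]
    # Metabolite intensities measured in the neonatal dried blood spot.
    cotinine_intensity: float
    hydroxycotinine_intensity: float


# Hypothetical cut-off: the abstract does not give a threshold.
BIOMARKER_CUTOFF = 1000.0


def alloyed_smoker(rec: MotherRecord, cutoff: float = BIOMARKER_CUTOFF) -> bool:
    """Assumed rule: smoker if reported OR either metabolite reaches the cut-off."""
    biomarker_positive = (
        rec.cotinine_intensity >= cutoff
        or rec.hydroxycotinine_intensity >= cutoff
    )
    # bool(None) is False, so a missing report falls back to the biomarker alone.
    return bool(rec.reported_smoking) or biomarker_positive
```

Under this assumed rule, a record with a negative or missing report can still be labelled a smoker when a metabolite is detected, which is how a combined standard can recover under-reported smoking.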
Methods: We designed a case-control study in which childhood cancer cases were identified from the California Cancer Registry and controls, who were cancer-free by the age of six, were drawn from the California birth rolls for 1983 to 2011. In this analysis, we included 894 control participants and performed high-resolution metabolomics analyses of their neonatal dried blood spots, from which we extracted cotinine [mass-to-charge ratio (m/z) = 177.1023] and hydroxycotinine (m/z = 193.0973). Potential predictors of smoking were selected from California birth certificates. Logistic regression with stepwise backward selection was used to build a prediction model. Model performance was evaluated in a training sample, a bootstrapped sample, and an external validation sample.
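Extracting a metabolite from high-resolution metabolomics data usually means finding the measured feature whose m/z lies within a small tolerance of the target value. The following is a minimal sketch of that matching step using the two m/z values quoted above; the 5 ppm tolerance, the function names, and the `(mz, intensity)` feature layout are assumptions, not details taken from the study.

```python
COTININE_MZ = 177.1023          # from the abstract
HYDROXYCOTININE_MZ = 193.0973   # from the abstract


def ppm_error(observed_mz: float, target_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - target_mz) / target_mz * 1e6


def match_feature(features, target_mz, tol_ppm=5.0):
    """Return the (mz, intensity) feature closest to target_mz within tol_ppm.

    `features` is a list of (mz, intensity) pairs; returns None if nothing
    falls inside the tolerance. The 5 ppm default is an assumed value.
    """
    best = None
    for mz, intensity in features:
        err = abs(ppm_error(mz, target_mz))
        if err <= tol_ppm and (best is None or err < best[0]):
            best = (err, mz, intensity)
    return None if best is None else (best[1], best[2])


# Made-up feature list for illustration only.
features = [(177.1020, 5400.0), (177.1100, 900.0), (193.0975, 1200.0)]
cotinine = match_feature(features, COTININE_MZ)
hydroxycotinine = match_feature(features, HYDROXYCOTININE_MZ)
```

Here the feature at 177.1020 is about 1.7 ppm from the cotinine target and is kept, while the one at 177.1100 is roughly 43 ppm away and is rejected.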
Results: Out of seven predictor variables entered into the logistic model, five were selected by the stepwise procedure: maternal race/ethnicity, maternal education, child's birth year, parity, and child's birth weight. We calculated an overall discrimination accuracy of 0.72 and an area under the receiver operating characteristic curve (AUC) of 0.81 (95% confidence interval [CI] 0.77, 0.84) in the training set. Similar accuracies were achieved in the internal (AUC 0.81, 95% CI 0.77, 0.84) and external (AUC 0.69, 95% CI 0.64, 0.74) validation sets.
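The AUC and accuracy reported above can be computed from predicted probabilities and true labels. The sketch below shows a rank-based (Mann-Whitney) AUC, a thresholded accuracy, and a percentile bootstrap confidence interval. It is an illustration under stated assumptions rather than the authors' procedure: the 0.5 classification threshold, the number of resamples, and the toy data are all hypothetical.

```python
import random


def auc(labels, scores):
    """Probability that a random smoker (1) outscores a random non-smoker (0).

    Ties count as half a win (Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Share of correct classifications at an assumed probability threshold."""
    correct = sum(int(s >= threshold) == y for y, s in zip(labels, scores))
    return correct / len(labels)


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC, resampling subjects with replacement."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data for illustration only.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(labels, scores)        # 3 of 4 smoker/non-smoker pairs are ordered correctly
toy_acc = accuracy(labels, scores)   # 3 of 4 subjects are classified correctly at 0.5
toy_ci = bootstrap_auc_ci(labels, scores)
```

A drop from the training AUC to the external AUC, as reported above, is the kind of gap this comparison is meant to expose when the same functions are run on each sample.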
Conclusions: This easy-to-apply model may benefit future birth registry-based studies when there is missing maternal smoking information; however, some smoking status misclassification remains a concern when only variables from the birth certificate are used to predict maternal smoking.
About the journal:
Paediatric and Perinatal Epidemiology crosses the boundaries between the epidemiologist and the paediatrician, obstetrician or specialist in child health, ensuring that important paediatric and perinatal studies reach those clinicians for whom the results are especially relevant. In addition to original research articles, the Journal also includes commentaries, book reviews and annotations.