Joyce M Molenaar, Ka Yin Leung, Lindsey van der Meer, Peter Paul F Klein, Jeroen N Struijs, Jessica C Kiefte-de Jong
European Journal of Public Health, pp. 1210-1217. Published 2024-12-01. DOI: 10.1093/eurpub/ckae184 (https://doi.org/10.1093/eurpub/ckae184). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11631480/pdf/
Predicting population-level vulnerability among pregnant women using routinely collected data and the added relevance of self-reported data.
Recognizing and addressing vulnerability during the first thousand days of life can prevent health inequities. It is necessary to determine the best data for predicting multidimensional vulnerability (i.e. risk factors to vulnerability across different domains and a lack of protective factors) at population level to understand national prevalence and trends. This study aimed to (1) assess the feasibility of predicting multidimensional vulnerability during pregnancy using routinely collected data, (2) explore potential improvement of these predictions by adding self-reported data on health, well-being, and lifestyle, and (3) identify the most relevant predictors. The study was conducted using Dutch nationwide routinely collected data and self-reported Public Health Monitor data. First, to predict multidimensional vulnerability using routinely collected data, we used random forest (RF) and considered the area under the curve (AUC) and F1 measure to assess RF model performance. To validate results, sensitivity analyses (XGBoost and Lasso) were done. Second, we gradually added self-reported data to predictions. Third, we explored the RF model's variable importance. The initial RF model could distinguish between those with and without multidimensional vulnerability (AUC = 0.98). The model was able to correctly predict multidimensional vulnerability in most cases, but there was also misclassification (F1 measure = 0.70). Adding self-reported data improved RF model performance (e.g. F1 measure = 0.80 after adding perceived health). The strongest predictors concerned self-reported health, socioeconomic characteristics, and healthcare expenditures and utilization. It seems possible to predict multidimensional vulnerability using routinely collected data that is readily available. However, adding self-reported data can improve predictions.
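The abstract reports a very high AUC (0.98) alongside a more modest F1 measure (0.70). The following minimal sketch illustrates how the two metrics can diverge when the outcome is relatively rare. It uses invented toy scores, not the study's data or model. The function names `auc` and `f1`, the 0.5 threshold, and the class balance are assumptions made purely for illustration.

```python
# Toy illustration only: labels and scores are invented, not taken from the study.

def auc(labels, scores):
    """Probability that a random positive case scores higher than a
    random negative case (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f1(labels, scores, threshold=0.5):
    """F1 measure after turning scores into 0/1 predictions at `threshold`."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# 4 hypothetical "vulnerable" cases (label 1) among 20 individuals.
labels = [1, 1, 1, 1] + [0] * 16
scores = [0.9, 0.8, 0.6, 0.4] + [
    0.55, 0.35, 0.3, 0.25, 0.2, 0.2, 0.15, 0.15,
    0.1, 0.1, 0.1, 0.05, 0.05, 0.05, 0.05, 0.05,
]

print(f"AUC = {auc(labels, scores):.3f}")  # AUC = 0.984
print(f"F1  = {f1(labels, scores):.3f}")   # F1  = 0.750
```

In this toy example, the ranking is almost perfect (only one negative case outscores one positive case), so the AUC is close to 1. At the chosen threshold, however, one false positive and one false negative among only four true cases are enough to pull the F1 measure down to 0.75. This is one plausible reading of the gap between the two reported metrics, not a statement about the authors' actual results.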
About the journal:
The European Journal of Public Health (EJPH) is a multidisciplinary journal aimed at attracting contributions from epidemiology, health services research, health economics, social sciences, management sciences, ethics and law, environmental health sciences, and other disciplines of relevance to public health. The journal provides a forum for discussion and debate of current international public health issues, with a focus on the European Region. Bi-monthly issues contain peer-reviewed original articles, editorials, commentaries, book reviews, news, letters to the editor, announcements of events, and various other features.