Emily A Knapp, Amii M Kress, Ronel Ghidey, Tyler J Gorham, Brendan Galdo, Stephen A Petrill, Izzuddin M Aris, Theresa M Bastain, Carlos A Camargo, Michael A Coccia, Nicholas Cragoe, Dana Dabelea, Anne L Dunlop, Tebeb Gebretsadik, Tina Hartert, Alison E Hipwell, Christine C Johnson, Margaret R Karagas, Kaja Z LeWinn, Luis Enrique Maldonado, Cindy T McEvoy, Hooman Mirzakhani, Thomas G O'Connor, T Michael O'Shea, Zhu Wang, Rosalind J Wright, Katherine Ziegler, Yeyi Zhu, Christopher W Bartlett, Bryan Lau
Epidemiology, pages 413-424. Published 2025-05-01 (Epub 2025-04-01). Publication type: Journal Article.
DOI: https://doi.org/10.1097/EDE.0000000000001832
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11991882/pdf/
A Latent Trait-based Measure as a Data Harmonization and Missing Data Solution Applied to the Environmental Influences on Child Health Outcomes Cohort.
Background: Collaborative research consortia provide an efficient method to increase sample size, enabling evaluation of subgroup heterogeneity and rare outcomes. In addition to the missing data challenges faced by all cohort studies, such as nonresponse and attrition, collaborative studies have missing data due to differences in the study design and measurements of the contributing studies.
Methods: We extend ROSETTA, a latent variable method that creates common measures across datasets collecting the same latent constructs with only partial overlap in measures, to define a common measure of socioeconomic status (SES) across cohorts with varying indicators in the Environmental influences on Child Health Outcomes Cohort, a consortium of pregnancy and pediatric cohorts.
Results: Starting with 52 indicators of prenatal SES from 39,372 participants across 53 cohorts, ROSETTA created three factors representing key domains of SES: income and education, insurance and poverty, and unemployment. At least one factor score was available for 34,528 participants, and two factors were available for more participants than any single indicator. Factors fit the data well, had content validity, and were correlated with alternative measures of SES (for the income and education factor, r = 0.40-0.89). Higher SES as measured by the factor scores was associated with lower odds of prenatal smoking: the odds ratio for the income and education factor was 0.42 (95% confidence interval: 0.38, 0.45). Missing data were reduced compared with most methods, except for multiple imputation.
Conclusion: ROSETTA aids in pooled analysis of individual participant data by creating measures on a common scale and maximizing data in the presence of missing and mismatched measures.
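The coverage argument in the Results can be illustrated with a toy sketch. The Python below is not the ROSETTA method and uses no ECHO data; the cohorts, indicator names, and values are all hypothetical. It only shows why a measure that can be computed from any available indicator of a construct may cover more participants than any single indicator when cohorts collect partially overlapping items.

```python
# Toy illustration only: hypothetical cohorts and indicator names.
# This is NOT the ROSETTA algorithm and does not use ECHO data.
from typing import Dict, List, Optional

# Each record maps an indicator name to its observed value, or None when the
# participant's cohort did not collect that item (or the item was skipped).
Record = Dict[str, Optional[float]]

participants: List[Record] = [
    # Hypothetical cohort A collects income and education.
    {"income": 3.0, "education": 2.0, "insurance": None},
    {"income": None, "education": 4.0, "insurance": None},
    # Hypothetical cohort B collects education and insurance.
    {"income": None, "education": 1.0, "insurance": 1.0},
    {"income": None, "education": None, "insurance": 0.0},
    # Hypothetical cohort C collects income only.
    {"income": 5.0, "education": None, "insurance": None},
    # A participant with no SES indicators at all.
    {"income": None, "education": None, "insurance": None},
]


def coverage_by_indicator(records: List[Record]) -> Dict[str, int]:
    """Count participants with a non-missing value for each indicator."""
    names = sorted({name for record in records for name in record})
    return {
        name: sum(record.get(name) is not None for record in records)
        for name in names
    }


def coverage_any_indicator(records: List[Record]) -> int:
    """Count participants with at least one non-missing indicator."""
    return sum(
        any(value is not None for value in record.values())
        for record in records
    )


per_indicator = coverage_by_indicator(participants)
any_indicator = coverage_any_indicator(participants)

print("Participants per single indicator:", per_indicator)
print("Participants with at least one indicator:", any_indicator)
```

In this toy data no single indicator is observed for more than half of the participants, yet most participants have at least one indicator. A latent variable approach aims to turn that partial information into a score on a common scale, which a simple count like this does not do.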
About the journal:
Epidemiology publishes original research from all fields of epidemiology. The journal also welcomes review articles and meta-analyses, novel hypotheses, descriptions and applications of new methods, and discussions of research theory or public health policy. We give special consideration to papers from developing countries.