The use of complete-case and multiple imputation-based analyses in molecular epidemiology studies that assess interaction effects.

Epidemiologic perspectives & innovations : EP+I. Pub Date: 2011-10-06. DOI: 10.1186/1742-5573-8-5

Manisha Desai, Denise A Esserman, Marilie D Gammon, Mary B Terry


Citations: 27

Abstract



Background: In molecular epidemiology studies biospecimen data are collected, often with the purpose of evaluating the synergistic role between a biomarker and another feature on an outcome. Typically, biomarker data are collected on only a proportion of subjects eligible for study, leading to a missing data problem. Missing data methods, however, are not customarily incorporated into analyses. Instead, complete-case (CC) analyses are performed, which can result in biased and inefficient estimates.

Methods: Through simulations, we characterized the performance of CC methods when interaction effects are estimated. We also investigated whether standard multiple imputation (MI) could improve estimation over CC methods when the data are not missing at random (NMAR) and auxiliary information may or may not exist.
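The kind of simulation described in the Methods can be sketched roughly as follows. This is a minimal illustration, not the authors' design: the linear outcome model, the coefficient values, the logistic missingness model, and the function name `simulate_cc_bias` are all assumptions made here for brevity. The biomarker `x` is made more likely to be missing depending on both its own value and the outcome (an NMAR mechanism), and the interaction coefficient is estimated once on the full data and once on complete cases only.

```python
import numpy as np


def simulate_cc_bias(n=20000, beta_int=0.5, seed=0):
    """Compare full-data and complete-case (CC) estimates of an interaction.

    All modelling choices below are illustrative assumptions, not taken
    from the paper.
    """
    rng = np.random.default_rng(seed)

    x = rng.normal(size=n)            # biomarker (will be partly missing)
    z = rng.binomial(1, 0.5, size=n)  # second feature, fully observed
    # Assumed linear outcome model with an x*z interaction term.
    y = 1.0 + 0.5 * x + 0.5 * z + beta_int * x * z + rng.normal(size=n)

    # Assumed NMAR mechanism: the probability that x is observed depends
    # on x itself and on the outcome y.
    p_observed = 1.0 / (1.0 + np.exp(-(0.5 - 1.0 * x - 1.0 * y)))
    observed = rng.random(n) < p_observed

    def interaction_estimate(mask):
        # Ordinary least squares on the rows selected by `mask`.
        design = np.column_stack(
            [np.ones(mask.sum()), x[mask], z[mask], x[mask] * z[mask]]
        )
        coef, *_ = np.linalg.lstsq(design, y[mask], rcond=None)
        return coef[3]  # coefficient on the x*z column

    full = interaction_estimate(np.ones(n, dtype=bool))
    cc = interaction_estimate(observed)
    return full, cc, observed.mean()


if __name__ == "__main__":
    full, cc, frac = simulate_cc_bias()
    print(f"fraction with biomarker observed: {frac:.2f}")
    print(f"full-data interaction estimate:   {full:.3f}")
    print(f"complete-case estimate:           {cc:.3f}")
```

Repeating the call over many seeds and averaging the difference between `cc` and the true `beta_int` would give a Monte Carlo estimate of CC bias under this particular assumed mechanism; other mechanisms can be tried by changing `p_observed`.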

Results: CC analyses were shown to result in considerable bias and efficiency loss. While MI reduced bias and increased efficiency over CC methods under specific conditions, it too resulted in biased estimates depending on the strength of the auxiliary data available and the nature of the missingness. In particular, CC performed better than MI when extreme values of the covariate were more likely to be missing, while MI outperformed CC when missingness of the covariate related to both the covariate and outcome. MI always improved performance when strong auxiliary data were available. In a real study, MI estimates of interaction effects were attenuated relative to those from a CC approach.

Conclusions: Our findings suggest the importance of incorporating missing data methods into the analysis. If the data are MAR, standard MI is a reasonable method. Auxiliary variables may make this assumption more reasonable even if the data are NMAR. Under NMAR we emphasize caution when using standard MI and recommend it over CC only when strong auxiliary data are available. MI, with the missing data mechanism specified, is an alternative when the data are NMAR. In all cases, it is recommended to take advantage of MI's ability to account for the uncertainty of these assumptions.
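As general background on how MI carries imputation uncertainty into a final estimate, the standard pooling formulas (Rubin's rules) are sketched below. The abstract does not state which pooling procedure the authors used, so this is offered as context only. The symbols are introduced here: m is the number of imputed data sets, \hat{Q}_j is the estimate (for example, an interaction coefficient) from the j-th imputed data set, and U_j is its estimated variance.

```latex
\bar{Q} = \frac{1}{m}\sum_{j=1}^{m} \hat{Q}_j
\qquad
\bar{U} = \frac{1}{m}\sum_{j=1}^{m} U_j
\qquad
B = \frac{1}{m-1}\sum_{j=1}^{m} \bigl(\hat{Q}_j - \bar{Q}\bigr)^2
\qquad
T = \bar{U} + \Bigl(1 + \frac{1}{m}\Bigr) B
```

Here \bar{Q} is the pooled estimate, \bar{U} the within-imputation variance, B the between-imputation variance, and T the total variance. The B term is what reflects the extra uncertainty due to the missing values; a CC analysis has no counterpart to it.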

