Analysis With Missing Data in Prevention Research

J. Graham, S. Hofer, A. Piccinin
Journal: NIDA research monograph, volume 142 1, pages 13-63. Publication date: 1997-01-01. Publication type: Journal Article. DOI: 10.1037/10222-010 (https://doi.org/10.1037/10222-010). Source platform: Semantic Scholar.
Citations: 311

Abstract

Missing data problems have been a thorn in the side of prevention researchers for years. Although some solutions for these problems have been available in the statistical literature, they have not found their way into mainstream prevention research. This chapter is meant to serve as an introduction to the systematic application of the missing data analysis solutions presented recently by Little and Rubin (1987) and others. The chapter does not describe a complete strategy, but it is relevant for (1) missing data analysis with continuous (but not categorical) data; (2) data that are reasonably normally distributed; and (3) solutions for missing data problems in analyses related to the general linear model, in particular analyses that use (or can use) a covariance matrix as input. The examples in the chapter come from drug prevention research. The chapter discusses (1) the problem of wanting to ask respondents more questions than most individuals can answer; (2) the problem of attrition and some solutions; and (3) the problem of special measurement procedures that are too expensive or time-consuming to obtain for all subjects.
The authors end with several conclusions: whenever possible, researchers should use the Expectation-Maximization (EM) algorithm (or another maximum likelihood procedure, including the multiple-group structural equation modeling procedure or, where appropriate, multiple imputation) for analyses involving missing data (the chapter provides concrete examples); if researchers must use other analyses, they should keep in mind that these others produce biased results and should not be relied upon for final analyses; when data are missing, the appropriate missing data analysis procedures do not generate something out of nothing but do make the most of the data available; when data are missing, researchers should work hard (especially when planning a study) to find the cause of missingness and include the cause in the analysis models; and researchers should sample the cases originally missing (whenever possible) and adjust EM algorithm parameter estimates accordingly.
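
To make the EM recommendation concrete, the following is a minimal sketch of how an EM algorithm can estimate a mean vector and covariance matrix from continuous data with missing values. It is not taken from the chapter: it assumes multivariate normal data and missingness that can be ignored given the observed values, and the function name `em_mvn`, its parameters, and the simulated data are hypothetical choices made for illustration only.

```python
import numpy as np


def em_mvn(X, n_iter=200, tol=1e-8):
    """EM estimates of the mean and covariance of multivariate normal data.

    Missing values are coded as NaN. Hypothetical helper for illustration.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    miss = np.isnan(X)
    # Starting values: available-case means and a diagonal covariance.
    mu = np.nanmean(X, axis=0)
    sigma = np.diag(np.nanvar(X, axis=0))
    for _ in range(n_iter):
        sum_x = np.zeros(p)
        sum_xx = np.zeros((p, p))
        for i in range(n):
            m, o = miss[i], ~miss[i]
            x = X[i].copy()
            c = np.zeros((p, p))  # conditional covariance of the missing part
            if m.any():
                if o.any():
                    # E-step: regress the missing variables on the observed ones.
                    b = sigma[np.ix_(m, o)] @ np.linalg.inv(sigma[np.ix_(o, o)])
                    x[m] = mu[m] + b @ (x[o] - mu[o])
                    c[np.ix_(m, m)] = sigma[np.ix_(m, m)] - b @ sigma[np.ix_(o, m)]
                else:
                    # Row with nothing observed: fall back on current estimates.
                    x[m] = mu[m]
                    c[np.ix_(m, m)] = sigma[np.ix_(m, m)]
            sum_x += x
            sum_xx += np.outer(x, x) + c
        # M-step: update the estimates from the expected sufficient statistics.
        new_mu = sum_x / n
        new_sigma = sum_xx / n - np.outer(new_mu, new_mu)
        change = max(np.abs(new_mu - mu).max(), np.abs(new_sigma - sigma).max())
        mu, sigma = new_mu, new_sigma
        if change < tol:
            break
    return mu, sigma


# Usage on simulated data (illustrative values, not from the chapter).
rng = np.random.default_rng(0)
true_cov = np.array([[1.0, 0.5, 0.3], [0.5, 1.0, 0.4], [0.3, 0.4, 1.0]])
data = rng.multivariate_normal([0.0, 1.0, 2.0], true_cov, size=300)
data[rng.random(300) < 0.2, 2] = np.nan  # about 20% missing on the third variable
mu_hat, sigma_hat = em_mvn(data)
print(np.round(mu_hat, 2))
print(np.round(sigma_hat, 2))
```

The resulting covariance matrix is a maximum likelihood estimate (it divides by n rather than n - 1), and it could then serve as input to a covariance-based analysis of the kind the abstract describes. Standard errors from such an analysis would need separate treatment, which this sketch does not attempt.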