Maximum Likelihood Estimation in a Semicontinuous Survival Model with Covariates Subject to Detection Limits
Author: Paul W Bernhardt
Journal: International Journal of Biostatistics, volume 14, issue 2
Published: 2018-10-31
DOI: https://doi.org/10.1515/ijb-2017-0058
Semicontinuous data are common in biological studies, occurring when a variable is continuous over a region but has a point mass at one or more points. In the motivating Genetic and Inflammatory Markers of Sepsis (GenIMS) study, it was of interest to determine how several biomarkers subject to detection limits were related to survival for patients entering the hospital with community-acquired pneumonia. While survival times were recorded for all individuals in the study, the primary endpoint of interest was the binary event of 90-day survival, and no patients were lost to follow-up prior to 90 days. To use all of the available survival information, we propose a two-part regression model in which the probability of surviving to 90 days is modeled using logistic regression and the survival distribution for those experiencing the event before 90 days is modeled with a truncated accelerated failure time model. We assume a series of mixture-of-normals regression models for the joint distribution of the censored biomarkers. To estimate the parameters in this model, we suggest a Monte Carlo EM algorithm in which multiple imputations are generated for the censored covariates to estimate the expectation in the E-step, and weighted maximization is then applied to the observed and imputed data in the M-step. We conduct simulations to assess the proposed model and maximization method, and we analyze the GenIMS data set.
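The two-part likelihood and the Monte Carlo E-step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's implementation: the accelerated failure time part is taken to be log-normal, a single normal distribution stands in for the mixture-of-normals covariate model, the imputation weights are plain importance weights proportional to the outcome likelihood, and all function and parameter names (`two_part_loglik`, `impute_censored`, `e_step_weights`, `beta`, `gamma`, `sigma`) are hypothetical.

```python
import math
import random
from statistics import NormalDist

TAU = 90.0  # 90-day cutoff named in the abstract


def log1pexp(a):
    """Numerically stable log(1 + exp(a))."""
    return max(a, 0.0) + math.log1p(math.exp(-abs(a)))


def two_part_loglik(survived, time, x, beta, gamma, sigma):
    """Log-likelihood contribution of one subject with covariate vector x.

    Part 1: logistic regression for surviving to TAU days.
    Part 2 (assumed log-normal AFT): density of the death time,
    truncated above at TAU, for subjects who died before TAU.
    """
    eta = sum(b * xi for b, xi in zip(beta, x))
    if survived:
        return -log1pexp(-eta)  # log P(survive to TAU)
    mu = sum(g * xi for g, xi in zip(gamma, x))
    z = (math.log(time) - mu) / sigma
    z_tau = (math.log(TAU) - mu) / sigma
    # log-normal log-density of the observed death time
    log_density = -0.5 * z * z - 0.5 * math.log(2 * math.pi) - math.log(sigma * time)
    # log P(T < TAU), the truncation normaliser
    log_trunc = math.log(NormalDist().cdf(z_tau))
    return -log1pexp(eta) + log_density - log_trunc  # log P(die) + truncated density


def impute_censored(detection_limit, cov_mean, cov_sd, n_draws, rng):
    """Draw values below the detection limit from a normal truncated at the limit."""
    dist = NormalDist(cov_mean, cov_sd)
    upper = dist.cdf(detection_limit)
    # Inverse-CDF sampling; 1 - rng.random() lies in (0, 1], so p stays above 0.
    return [dist.inv_cdf((1.0 - rng.random()) * upper) for _ in range(n_draws)]


def e_step_weights(survived, time, x_obs, draws, beta, gamma, sigma):
    """Normalised weights for each imputed value, proportional to the
    outcome likelihood evaluated with that value filled in."""
    logw = [two_part_loglik(survived, time, x_obs + [d], beta, gamma, sigma)
            for d in draws]
    m = max(logw)  # subtract the maximum before exponentiating, for stability
    w = [math.exp(lw - m) for lw in logw]
    total = sum(w)
    return [wi / total for wi in w]


# Usage: one subject who died on day 30 with a biomarker below its detection limit.
rng = random.Random(1)
draws = impute_censored(detection_limit=1.0, cov_mean=2.0, cov_sd=1.0,
                        n_draws=5, rng=rng)
weights = e_step_weights(False, 30.0, [1.0], draws,
                         beta=[0.5, -0.3], gamma=[3.0, 0.2], sigma=1.0)
```

In an M-step, each imputed record would enter the logistic and truncated AFT fits with its weight, alongside the fully observed records at weight one. How the paper draws imputations and forms weights may differ from this sketch.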
About the journal:
The International Journal of Biostatistics (IJB) seeks to publish new biostatistical models and methods, new statistical theory, and original applications of statistical methods to important practical problems arising from the biological, medical, public health, and agricultural sciences, with an emphasis on semiparametric methods. Because many publication venues already exist within biostatistics, IJB offers a home for research focusing on modern methods, often based on machine learning and other data-adaptive methodologies, and provides a unique reading experience that compels the author to be explicit about the statistical inference problem addressed by the paper. The journal is intended to cover the entire range of biostatistics, from theoretical advances to relevant and sensible translations of a practical problem into a statistical framework. Electronic publication also allows data and software code to be appended, opening the door to reproducible research in which readers can easily replicate the analyses described in a paper. Both original research and review articles will be warmly received, as will articles applying sound statistical methods to practical problems.