{"title":"Gradient boosting for linear mixed models.","authors":"Colin Griesbach, Benjamin Säfken, Elisabeth Waldmann","doi":"10.1515/ijb-2020-0136","DOIUrl":"https://doi.org/10.1515/ijb-2020-0136","url":null,"abstract":"<p><p>Gradient boosting from the field of statistical learning is widely known as a powerful framework for estimation and selection of predictor effects in various regression models by adapting concepts from classification theory. Current boosting approaches also offer methods accounting for random effects and thus enable prediction of mixed models for longitudinal and clustered data. However, these approaches include several flaws resulting in unbalanced effect selection with falsely induced shrinkage and a low convergence rate on the one hand and biased estimates of the random effects on the other hand. We therefore propose a new boosting algorithm which explicitly accounts for the random structure by excluding it from the selection procedure, properly correcting the random effects estimates and in addition providing likelihood-based estimation of the random effects variance structure. The new algorithm offers an organic and unbiased fitting approach, which is shown via simulations and data examples.</p>","PeriodicalId":49058,"journal":{"name":"International Journal of Biostatistics","volume":"17 2","pages":"317-329"},"PeriodicalIF":1.2,"publicationDate":"2021-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1515/ijb-2020-0136","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39928476","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The method of envelopes to concisely calculate semiparametric efficient scores under parametric restrictions.","authors":"Constantine E Frangakis","doi":"10.1515/ijb-2019-0043","DOIUrl":"https://doi.org/10.1515/ijb-2019-0043","url":null,"abstract":"<p><p>When addressing semiparametric problems with parametric restrictions (assumptions on the distribution), the efficient score (ES) of a parameter is often important for generating useful estimates. However, usual derivation of ES, although conceptually simple, is often lengthy and with many steps that do not help in understanding why its final form arises. This drawback often casts onto semiparametric estimation a mantle that can turn away otherwise able doctoral students or researchers. Here we show that many ESs can be obtained as a one-step derivation after we characterize those features (envelopes) of the unrestricted problem that are constrained in the restricted problem. We demonstrate our arguments in three problems with known ES but whose usual derivations are lengthy. We show that the envelope-based derivation is dramatically explanatory and compact, needing essentially two lines where the standard approach needs 10 or more pages. This suggests that the envelope method can add useful intuition and exegesis to both teaching and research of semiparametric estimation.</p>","PeriodicalId":49058,"journal":{"name":"International Journal of Biostatistics","volume":"17 1","pages":"1-5"},"PeriodicalIF":1.2,"publicationDate":"2020-12-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1515/ijb-2019-0043","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"38995775","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A kernel- and optimal transport- based test of independence between covariates and right-censored lifetimes.","authors":"David Rindt, Dino Sejdinovic, David Steinsaltz","doi":"10.1515/ijb-2020-0022","DOIUrl":"https://doi.org/10.1515/ijb-2020-0022","url":null,"abstract":"<p><p>We propose a nonparametric test of independence, termed optHSIC, between a covariate and a right-censored lifetime. Because the presence of censoring creates a challenge in applying the standard permutation-based testing approaches, we use optimal transport to transform the censored dataset into an uncensored one, while preserving the relevant dependencies. We then apply a permutation test using the kernel-based dependence measure as a statistic to the transformed dataset. The type 1 error is proven to be correct in the case where censoring is independent of the covariate. Experiments indicate that optHSIC has power against a much wider class of alternatives than Cox proportional hazards regression and that it has the correct type 1 control even in the challenging cases where censoring strongly depends on the covariate.</p>","PeriodicalId":49058,"journal":{"name":"International Journal of Biostatistics","volume":"17 2","pages":"331-348"},"PeriodicalIF":1.2,"publicationDate":"2020-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1515/ijb-2020-0022","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39928478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Effects of covariates on alternating recurrent events in accelerated failure time models.","authors":"Moumita Chatterjee, Sugata Sen Roy","doi":"10.1515/ijb-2019-0099","DOIUrl":"https://doi.org/10.1515/ijb-2019-0099","url":null,"abstract":"<p><p>In this article, we model alternately occurring recurrent events and study the effects of covariates on each of the survival times. This is done through the accelerated failure time models, where we use lagged event times to capture the dependence over both the cycles and the two events. However, since the errors of the two regression models are likely to be correlated, we assume a bivariate error distribution. Since most event time distributions do not readily extend to bivariate forms, we take recourse to copula functions to build up the bivariate distributions from the marginals. The model parameters are then estimated using the maximum likelihood method and the properties of the estimators studied. A data set on respiratory disease is used to illustrate the technique. A simulation study is also conducted to check for consistency.</p>","PeriodicalId":49058,"journal":{"name":"International Journal of Biostatistics","volume":"17 2","pages":"295-315"},"PeriodicalIF":1.2,"publicationDate":"2020-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1515/ijb-2019-0099","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"38588641","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian information criterion approximations to Bayes factors for univariate and multivariate logistic regression models.","authors":"Katharina Selig, Pamela Shaw, Donna Ankerst","doi":"10.1515/ijb-2020-0045","DOIUrl":"https://doi.org/10.1515/ijb-2020-0045","url":null,"abstract":"<p><p>Schwarz's criterion, also known as the Bayesian Information Criterion or BIC, is commonly used for model selection in logistic regression due to its simple intuitive formula. For tests of nested hypotheses in independent and identically distributed data as well as in Normal linear regression, previous results have motivated use of Schwarz's criterion by its consistent approximation to the Bayes factor (BF), defined as the ratio of posterior to prior model odds. Furthermore, under construction of an intuitive unit-information prior for the parameters of interest to test for inclusion in the nested models, previous results have shown that Schwarz's criterion approximates the BF to higher order in the neighborhood of the simpler nested model. This paper extends these results to univariate and multivariate logistic regression, providing approximations to the BF for arbitrary prior distributions and definitions of the unit-information prior corresponding to Schwarz's approximation. Simulations show accuracies of the approximations for small sample sizes as well as comparisons to conclusions from frequentist testing. We present an application in prostate cancer, the motivating setting for our work, which illustrates the approximation for large data sets in a practical example.</p>","PeriodicalId":49058,"journal":{"name":"International Journal of Biostatistics","volume":"17 2","pages":"241-266"},"PeriodicalIF":1.2,"publicationDate":"2020-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1515/ijb-2020-0045","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"38637717","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parametric models for combined failure time data from an incident cohort study and a prevalent cohort study with follow-up.","authors":"James McVittie, David Wolfson, David Stephens, Vittorio Addona, David Buckeridge","doi":"10.1515/ijb-2020-0042","DOIUrl":"https://doi.org/10.1515/ijb-2020-0042","url":null,"abstract":"<p><p>A classical problem in survival analysis is to estimate the failure time distribution from right-censored observations obtained from an incident cohort study. Frequently, however, failure time data comprise two independent samples, one from an incident cohort study and the other from a prevalent cohort study with follow-up, which is known to produce length-biased observed failure times. There are drawbacks to each of these two types of study when viewed separately. We address two main questions here: (i) Can our statistical inference be enhanced by combining data from an incident cohort study with data from a prevalent cohort study with follow-up? (ii) What statistical methods are appropriate for these combined data? The theory we develop to address these questions is based on a parametrically defined failure time distribution and is supported by simulations. We apply our methods to estimate the duration of hospital stays.</p>","PeriodicalId":49058,"journal":{"name":"International Journal of Biostatistics","volume":"17 2","pages":"283-293"},"PeriodicalIF":1.2,"publicationDate":"2020-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1515/ijb-2020-0042","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"38621094","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Targeted design for adaptive clinical trials via semiparametric model.","authors":"Hongbin Zhang, Ao Yuan, Ming T Tan","doi":"10.1515/ijb-2018-0100","DOIUrl":"https://doi.org/10.1515/ijb-2018-0100","url":null,"abstract":"<p><p>The precision medicine approach, which assigns treatment according to an individual's personal (including molecular) profile, is revolutionizing health care. Existing statistical methods for clinical trial design typically assume a known model to estimate characteristics of treatment outcomes, which may yield biased results if the true model deviates far from the assumed one. This article aims to achieve model robustness in a phase II multi-stage adaptive clinical trial design. We propose and study a semiparametric regression mixture model in which the mixing proportions are specified according to the subjects' profiles, and each sub-group distribution is only assumed to be unimodal for robustness. The regression parameters and the error density functions are estimated by semiparametric maximum likelihood and isotonic regression estimators. The asymptotic properties of the estimates are studied. Simulation studies are conducted to evaluate the performance of the method after a real data analysis.</p>","PeriodicalId":49058,"journal":{"name":"International Journal of Biostatistics","volume":"17 2","pages":"177-190"},"PeriodicalIF":1.2,"publicationDate":"2020-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1515/ijb-2018-0100","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"38463322","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Causal mediation analysis in presence of multiple mediators uncausally related.","authors":"Allan Jérolon, Laura Baglietto, Etienne Birmelé, Flora Alarcon, Vittorio Perduca","doi":"10.1515/ijb-2019-0088","DOIUrl":"https://doi.org/10.1515/ijb-2019-0088","url":null,"abstract":"<p><p>Mediation analysis aims at disentangling the effects of a treatment on an outcome through alternative causal mechanisms and has become a popular practice in biomedical and social science applications. The causal framework based on counterfactuals is currently the standard approach to mediation, with important methodological advances introduced in the literature in the last decade, especially for simple mediation, that is with one mediator at a time. Among a variety of alternative approaches, Imai et al. showed theoretical results and developed an R package to deal with simple mediation as well as with multiple mediation involving multiple mediators conditionally independent given the treatment and baseline covariates. This approach does not allow for the often encountered situation in which an unobserved common cause induces a spurious correlation between the mediators. In this context, which we refer to as mediation with uncausally related mediators, we show that, under appropriate hypotheses, the natural direct and joint indirect effects are non-parametrically identifiable. Moreover, we adopt the quasi-Bayesian algorithm developed by Imai et al. and propose a procedure based on the simulation of counterfactual distributions to estimate not only the direct and joint indirect effects but also the indirect effects through individual mediators. We study the properties of the proposed estimators through simulations. As an illustration, we apply our method on a real data set from a large cohort to assess the effect of hormone replacement treatment on breast cancer risk through three mediators, namely dense mammographic area, nondense area and body mass index.</p>","PeriodicalId":49058,"journal":{"name":"International Journal of Biostatistics","volume":"17 2","pages":"191-221"},"PeriodicalIF":1.2,"publicationDate":"2020-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1515/ijb-2019-0088","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"38432917","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Marginal quantile regression for longitudinal data analysis in the presence of time-dependent covariates.","authors":"I-Chen Chen, Philip M Westgate","doi":"10.1515/ijb-2020-0010","DOIUrl":"10.1515/ijb-2020-0010","url":null,"abstract":"<p><p>When observations are correlated, modeling the within-subject correlation structure using quantile regression for longitudinal data can be difficult unless a working independence structure is utilized. Although this approach ensures consistent estimators of the regression coefficients, it may result in less efficient regression parameter estimation when data are highly correlated. Therefore, several marginal quantile regression methods have been proposed to improve parameter estimation. In a longitudinal study some of the covariates may change their values over time, and the topic of time-dependent covariates has not been explored in the marginal quantile literature. As a result, we propose an approach for marginal quantile regression in the presence of time-dependent covariates, which includes a strategy to select a working type of time-dependency. In this manuscript, we demonstrate that our proposed method has the potential to improve power relative to the independence estimating equations approach due to the reduction of mean squared error.</p>","PeriodicalId":49058,"journal":{"name":"International Journal of Biostatistics","volume":"17 2","pages":"267-282"},"PeriodicalIF":1.0,"publicationDate":"2020-09-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8591406/pdf/nihms-1754217.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"38429312","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Seemingly unrelated regression with measurement error: estimation via Markov Chain Monte Carlo and mean field variational Bayes approximation.","authors":"Georges Bresson, Anoop Chaturvedi, Mohammad Arshad Rahman, Shalabh","doi":"10.1515/ijb-2019-0120","DOIUrl":"https://doi.org/10.1515/ijb-2019-0120","url":null,"abstract":"<p><p>Linear regression with measurement error in the covariates is a heavily studied topic; however, the statistics/econometrics literature is almost silent on estimating a multi-equation model with measurement error. This paper considers a seemingly unrelated regression model with measurement error in the covariates and introduces two novel estimation methods: a pure Bayesian algorithm (based on Markov chain Monte Carlo techniques) and its mean field variational Bayes (MFVB) approximation. The MFVB method has the added advantage of being computationally fast and can handle big data. An issue pertinent to measurement error models is parameter identification, and this is resolved by employing a prior distribution on the measurement error variance. The methods are shown to perform well in multiple simulation studies, where we analyze the impact on posterior estimates for different values of reliability ratio or variance of the true unobserved quantity used in the data generating process. The paper further implements the proposed algorithms in an application drawn from the health literature and shows that modeling measurement error in the data can improve model fitting.</p>","PeriodicalId":49058,"journal":{"name":"International Journal of Biostatistics","volume":"17 1","pages":"75-97"},"PeriodicalIF":1.2,"publicationDate":"2020-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1515/ijb-2019-0120","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"38396067","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}