{"title":"The Two Sample Problem for Multiple Categorical Variables","authors":"A. DiRienzo","doi":"10.2202/1557-4679.1019","DOIUrl":"https://doi.org/10.2202/1557-4679.1019","url":null,"abstract":"Comparing two large multivariate distributions is potentially complicated at least for the following reasons. First, some variable/level combinations may have a redundant difference in prevalence between groups in the sense that the difference can be completely explained in terms of lower-order combinations. Second, the total number of variable/level combinations to compare between groups is very large, and likely computationally prohibitive. In this paper, for both the paired and independent sample case, an approximate comparison method is proposed, along with a computationally efficient algorithm, that estimates the set of variable/level combinations that have a non-redundant different prevalence between two populations. The probability that the estimate contains one or more false or redundant differences is asymptotically bounded above by any pre-specified level for arbitrary data-generating distributions. The method is shown to perform well for finite samples in a simulation study, and is used to investigate HIV-1 genotype evolution in a recent AIDS clinical trial.","PeriodicalId":50333,"journal":{"name":"International Journal of Biostatistics","volume":"2 1","pages":""},"PeriodicalIF":1.2,"publicationDate":"2006-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.2202/1557-4679.1019","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"68715010","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparing Distribution Functions via Empirical Likelihood","authors":"I. McKeague, Yichuan Zhao","doi":"10.2202/1557-4679.1007","DOIUrl":"https://doi.org/10.2202/1557-4679.1007","url":null,"abstract":"This paper develops empirical likelihood based simultaneous confidence bands for differences and ratios of two distribution functions from independent samples of right-censored survival data. The proposed confidence bands provide a flexible way of comparing treatments in biomedical settings, and bring empirical likelihood methods to bear on important target functions for which only Wald-type confidence bands have been available in the literature. The approach is illustrated with a real data example.","PeriodicalId":50333,"journal":{"name":"International Journal of Biostatistics","volume":"1 1","pages":""},"PeriodicalIF":1.2,"publicationDate":"2006-01-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.2202/1557-4679.1007","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"68714433","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Regression Model for Dependent Gap Times","authors":"R. Strawderman","doi":"10.2202/1557-4679.1005","DOIUrl":"https://doi.org/10.2202/1557-4679.1005","url":null,"abstract":"A natural choice of time scale for analyzing recurrent event data is the \"gap\" (or sojourn) time between successive events. In many situations it is reasonable to assume correlation exists between the successive events experienced by a given subject. This paper looks at the problem of extending the accelerated failure time (AFT) model to the case of dependent recurrent event data via intensity modeling. Specifically, the accelerated gap times model of Strawderman (2005), a semiparametric intensity model for independent gap time data, is extended to the case of multiplicative gamma frailty. As argued in Aalen & Husebye (1991), incorporating frailty captures the heterogeneity between subjects and the \"hazard\" portion of the intensity model captures gap time variation within a subject. Estimators are motivated using semiparametric efficiency theory and lead to useful generalizations of the rank statistics considered in Strawderman (2005). Several interesting distinctions arise in comparison to the Cox-Andersen-Gill frailty model (e.g., Nielsen et al., 1992; Klein, 1992). The proposed methodology is illustrated by simulation and data analysis.","PeriodicalId":50333,"journal":{"name":"International Journal of Biostatistics","volume":"2 1","pages":""},"PeriodicalIF":1.2,"publicationDate":"2006-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"68714794","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"History-Adjusted Marginal Structural Models and Statically-Optimal Dynamic Treatment Regimens","authors":"M. J. van der Laan, M. Petersen, M. Joffe","doi":"10.2202/1557-4679.1003","DOIUrl":"https://doi.org/10.2202/1557-4679.1003","url":null,"abstract":"Marginal structural models (MSM) provide a powerful tool for estimating the causal effect of a treatment. These models, introduced by Robins, model the marginal distributions of treatment-specific counterfactual outcomes, possibly conditional on a subset of the baseline covariates. Marginal structural models are particularly useful in the context of longitudinal data structures, in which each subject's treatment and covariate history are measured over time, and an outcome is recorded at a final time point. However, the utility of these models for some applications has been limited by their inability to incorporate modification of the causal effect of treatment by time-varying covariates. Particularly in the context of clinical decision making, such time-varying effect modifiers are often of considerable or even primary interest, as they are used in practice to guide treatment decisions for an individual. In this article we propose a generalization of marginal structural models, which we call history-adjusted marginal structural models (HA-MSM). These models allow estimation of adjusted causal effects of treatment, given the observed past, and are therefore more suitable for making treatment decisions at the individual level and for identification of time-dependent effect modifiers. Specifically, a HA-MSM models the conditional distribution of treatment-specific counterfactual outcomes, conditional on the whole or a subset of the observed past up till a time-point, simultaneously for all time-points. Double robust inverse probability of treatment weighted estimators have been developed and studied in detail for standard MSM. We extend these results by proposing a class of double robust inverse probability of treatment weighted estimators for the unknown parameters of the HA-MSM. In addition, we show that HA-MSM provide a natural approach to identifying the dynamic treatment regimen which follows, at each time-point, the history-adjusted (up till the most recent time point) optimal static treatment regimen. We illustrate our results using an example drawn from the treatment of HIV infection.","PeriodicalId":50333,"journal":{"name":"International Journal of Biostatistics","volume":"1 1","pages":""},"PeriodicalIF":1.2,"publicationDate":"2005-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.2202/1557-4679.1003","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"68714712","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Score Statistics for Current Status Data: Comparisons with Likelihood Ratio and Wald Statistics","authors":"M. Banerjee, J. Wellner","doi":"10.2202/1557-4679.1001","DOIUrl":"https://doi.org/10.2202/1557-4679.1001","url":null,"abstract":"In this paper we introduce three natural \"score statistics\" for testing the hypothesis that F(t_0) takes on a fixed value in the context of nonparametric inference with current status data. These three new test statistics have natural interpretations in terms of certain (weighted) L_2 distances, and are also connected to natural \"one-sided\" scores. We compare these new test statistics with the analogue of the classical Wald statistic and the likelihood ratio statistic introduced in Banerjee and Wellner (2001) for the same testing problem. Under classical \"regular\" statistical problems the likelihood ratio, score, and Wald statistics all have the same chi-squared limiting distribution under the null hypothesis. In sharp contrast, in this non-regular problem all three statistics have different limiting distributions under the null hypothesis. Thus we begin by establishing the limit distribution theory of the statistics under the null hypothesis, and discuss calculation of the relevant critical points for the test statistics. Once the null distribution theory is known, the immediate question becomes that of power. We establish the limiting behavior of the three types of statistics under local alternatives. We have also compared the power of these five different statistics via a limited Monte-Carlo study. Our conclusions are: (a) the Wald statistic is less powerful than the likelihood ratio and score statistics; and (b) one of the score statistics may have more power than the likelihood ratio statistic for some alternatives.","PeriodicalId":50333,"journal":{"name":"International Journal of Biostatistics","volume":"1 1","pages":""},"PeriodicalIF":1.2,"publicationDate":"2005-08-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.2202/1557-4679.1001","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"68714657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Some Variants of the Backcalculation Method for Estimation of Disease Incidence: An Application to Multiple Sclerosis Data from the Faroe Islands","authors":"N. Jewell, B. Lu","doi":"10.2202/1557-4679.1002","DOIUrl":"https://doi.org/10.2202/1557-4679.1002","url":null,"abstract":"Backcalculation is a technique that was originally developed for the study of HIV incidence. Here we introduce some variants of the estimation technique that allow for (i) correlation of the unobserved disease incidence counts, and (ii) the use of a smoothing step as part of the maximizing step in the EM algorithm to reduce instability due to small diagnosis counts. Both of these issues can be important in the analysis of small \"epidemics.\" In addition, identification of correlation between diagnosis counts provides indirect evidence of correlation among unobserved incidence counts, hinting at the possibility of an infectious agent. We illustrate the ideas by reconstructing an incidence intensity function for the onset of multiple sclerosis, using data from the Faroe Islands. Previously, this data had been examined statistically, by Joseph, Wolfson & Wolfson (1990), to address the issue of infectiousness of multiple sclerosis. We argue that the incidence function cannot directly shed light on the enigmatic origin of multiple sclerosis in the Faroe Islands during World War II, and, in particular, cannot discriminate between hypotheses of an infectious or environmental agent.","PeriodicalId":50333,"journal":{"name":"International Journal of Biostatistics","volume":"1 1","pages":""},"PeriodicalIF":1.2,"publicationDate":"2005-06-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.2202/1557-4679.1002","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"68714703","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Weighted Risk Set Estimator for Survival Distributions in Two-Stage Randomization Designs with Censored Survival Data","authors":"Xiang Guo, A. Tsiatis","doi":"10.2202/1557-4679.1000","DOIUrl":"https://doi.org/10.2202/1557-4679.1000","url":null,"abstract":"In many clinical trials related to diseases such as cancers and HIV, patients are treated by different combinations of therapies. This leads to two-stage designs, where patients are initially randomized to a primary therapy and then depending on disease remission and patients' consent, a maintenance therapy will be randomly assigned. In such designs, the effects of different treatment policies, i.e., combinations of primary and maintenance therapy are of great interest. In this paper, we propose an estimator for the survival distribution for each treatment policy in such two-stage studies with right-censoring using the method of weighted estimation equations within risk sets. We also derive the large-sample properties. The method is demonstrated and compared with other estimators through simulations and applied to analyze a two-stage randomized study with leukemia patients.","PeriodicalId":50333,"journal":{"name":"International Journal of Biostatistics","volume":"1 1","pages":""},"PeriodicalIF":1.2,"publicationDate":"2005-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.2202/1557-4679.1000","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"68714613","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Relationship between Derivatives of the Observed and Full Loglikelihoods and Application to Newton-Raphson Algorithm","authors":"D. Commenges, V. Rondeau","doi":"10.2202/1557-4679.1010","DOIUrl":"https://doi.org/10.2202/1557-4679.1010","url":null,"abstract":"In the case of incomplete data we give general relationships between the first and second derivatives of the loglikelihood relative to the full and the incomplete observation set-ups. In the case where these quantities are easy to compute for the full observation set-up we propose to compute their analogue for the incomplete observation set-up using the above mentioned relationships: this involves numerical integrations. Once we are able to compute these quantities, Newton-Raphson type algorithms can be applied to find the maximum likelihood estimators, together with estimates of their variances. We detail the application of this approach to parametric multiplicative frailty models and we show that the method works well in practice using both a real data and a simulated example. The proposed algorithm outperforms a Newton-Raphson type algorithm using numerical derivatives.","PeriodicalId":50333,"journal":{"name":"International Journal of Biostatistics","volume":"2 1","pages":""},"PeriodicalIF":1.2,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"68714578","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Approximate Power and Sample Size Calculations with the Benjamini-Hochberg Method","authors":"J. A. Ferreira, A. Zwinderman","doi":"10.2202/1557-4679.1018","DOIUrl":"https://doi.org/10.2202/1557-4679.1018","url":null,"abstract":"We provide a method for calculating the sample size required to attain a given average power (the ratio of rejected hypotheses to the number of false hypotheses) and a given false discovery rate (the number of incorrect rejections divided by the number of rejections) in adaptive versions of the Benjamini-Hochberg method of multiple testing. The method works in an asymptotic sense as the number of hypotheses grows to infinity and under quite general conditions, and it requires data from a pilot study. The consistency of the method follows from several results in classical areas of nonparametric statistics developed in a new context of \"weak\" dependence.","PeriodicalId":50333,"journal":{"name":"International Journal of Biostatistics","volume":"2 1","pages":""},"PeriodicalIF":1.2,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.2202/1557-4679.1018","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"68714885","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}