{"title":"Power prior for borrowing the real-world data in bioequivalence test with a parallel design.","authors":"Lei Huang, Liwen Su, Yuling Zheng, Yuanyuan Chen, Fangrong Yan","doi":"10.1515/ijb-2020-0119","DOIUrl":"https://doi.org/10.1515/ijb-2020-0119","url":null,"abstract":"<p><p>Recently, real-world study has attracted wide attention for drug development. In bioequivalence study, the reference drug often has been marketed for many years and accumulated abundant real-world data. It is therefore appealing to incorporate these data in the design to improve trial efficiency. In this paper, we propose a Bayesian method to include real-world data of the reference drug in a current bioequivalence trial, with the aim to increase the power of analysis and reduce sample size for long half-life drugs. We adopt the power prior method for incorporating real-world data and use the average bioequivalence posterior probability to evaluate the bioequivalence between the test drug and the reference drug. Simulations were conducted to investigate the performance of the proposed method in different scenarios. The simulation results show that the proposed design has higher power than the traditional design without borrowing real-world data, while controlling the type I error. Moreover, the proposed method saves sample size and reduces costs for the trial.</p>","PeriodicalId":49058,"journal":{"name":"International Journal of Biostatistics","volume":"18 1","pages":"73-82"},"PeriodicalIF":1.2,"publicationDate":"2021-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"38960602","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Bayesian mixture model for changepoint estimation using ordinal predictors.","authors":"Emily Roberts, Lili Zhao","doi":"10.1515/ijb-2020-0151","DOIUrl":"https://doi.org/10.1515/ijb-2020-0151","url":null,"abstract":"<p><p>In regression models, predictor variables with inherent ordering, such ECOG performance status or novel biomarker expression levels, are commonly seen in medical settings. Statistically, it may be difficult to determine the functional form of an ordinal predictor variable. Often, such a variable is dichotomized based on whether it is above or below a certain cutoff. Other methods conveniently treat the ordinal predictor as a continuous variable and assume a linear relationship with the outcome. However, arbitrarily choosing a method may lead to inaccurate inference and treatment. In this paper, we propose a Bayesian mixture model to consider both dichotomous and linear forms for the variable. This allows for simultaneous assessment of the appropriate form of the predictor in regression models by considering the presence of a changepoint through the lens of a threshold detection problem. This method is applicable to continuous, binary, and survival outcomes, and it is easily amenable to penalized regression. We evaluated the proposed method using simulation studies and apply it to two real datasets. We provide JAGS code for easy implementation.</p>","PeriodicalId":49058,"journal":{"name":"International Journal of Biostatistics","volume":"18 1","pages":"57-72"},"PeriodicalIF":1.2,"publicationDate":"2021-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1515/ijb-2020-0151","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"25564949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian optimization design for finding a maximum tolerated dose combination in phase I clinical trials.","authors":"Ami Takahashi, Taiji Suzuki","doi":"10.1515/ijb-2020-0147","DOIUrl":"https://doi.org/10.1515/ijb-2020-0147","url":null,"abstract":"<p><p>The development of combination therapies has become commonplace because potential synergistic benefits are expected for resistant patients of single-agent treatment. In phase I clinical trials, the underlying premise is toxicity increases monotonically with increasing dose levels. This assumption cannot be applied in drug combination trials, however, as there are complex drug-drug interactions. Although many parametric model-based designs have been developed, strong assumptions may be inappropriate owing to little information available about dose-toxicity relationships. No standard solution for finding a maximum tolerated dose combination has been established. With these considerations, we propose a Bayesian optimization design for identifying a single maximum tolerated dose combination. Our proposed design utilizing Bayesian optimization guides the next dose by a balance of information between exploration and exploitation on the nonparametrically estimated dose-toxicity function, thereby allowing us to reach a global optimum with fewer evaluations. We evaluate the proposed design by comparing it with a Bayesian optimal interval design and with the partial-ordering continual reassessment method. The simulation results suggest that the proposed design works well in terms of correct selection probabilities and dose allocations. The proposed design has high potential as a powerful tool for use in finding a maximum tolerated dose combination.</p>","PeriodicalId":49058,"journal":{"name":"International Journal of Biostatistics","volume":"18 1","pages":"39-56"},"PeriodicalIF":1.2,"publicationDate":"2021-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1515/ijb-2020-0147","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"25560130","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"More than one way: exploring the capabilities of different estimation approaches to joint models for longitudinal and time-to-event outcomes.","authors":"Anja Rappl, Andreas Mayr, Elisabeth Waldmann","doi":"10.1515/ijb-2020-0067","DOIUrl":"https://doi.org/10.1515/ijb-2020-0067","url":null,"abstract":"<p><p>The development of physical functioning after a caesura in an aged population is still widely unexplored. Analysis of this topic would need to model the longitudinal trajectories of physical functioning and simultaneously take terminal events (deaths) into account. Separate analysis of both results in biased estimates, since it neglects the inherent connection between the two outcomes. Thus, this type of data generating process is best modelled jointly. To facilitate this several software applications were made available. They differ in model formulation, estimation technique (likelihood-based, Bayesian inference, statistical boosting) and a comparison of the different approaches is necessary to identify their capabilities and limitations. Therefore, we compared the performance of the packages JM, joineRML, JMbayes and JMboost of the R software environment with respect to estimation accuracy, variable selection properties and prediction precision. With these findings we then illustrate the topic of physical functioning after a caesura with data from the German ageing survey (DEAS). The results suggest that in smaller data sets and theory driven modelling likelihood-based methods (expectation maximation, JM, joineRML) or Bayesian inference (JMbayes) are preferable, whereas statistical boosting (JMboost) is a better choice with high-dimensional data and data exploration settings.</p>","PeriodicalId":49058,"journal":{"name":"International Journal of Biostatistics","volume":"18 1","pages":"127-149"},"PeriodicalIF":1.2,"publicationDate":"2021-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1515/ijb-2020-0067","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"25560133","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A zero-inflated non-negative matrix factorization for the deconvolution of mixed signals of biological data.","authors":"Yixin Kong, Ariangela Kozik, Cindy H Nakatsu, Yava L Jones-Hall, Hyonho Chun","doi":"10.1515/ijb-2020-0039","DOIUrl":"https://doi.org/10.1515/ijb-2020-0039","url":null,"abstract":"<p><p>A latent factor model for count data is popularly applied in deconvoluting mixed signals in biological data as exemplified by sequencing data for transcriptome or microbiome studies. Due to the availability of pure samples such as single-cell transcriptome data, the accuracy of the estimates could be much improved. However, the advantage quickly disappears in the presence of excessive zeros. To correctly account for this phenomenon in both mixed and pure samples, we propose a zero-inflated non-negative matrix factorization and derive an effective multiplicative parameter updating rule. In simulation studies, our method yielded the smallest bias. We applied our approach to brain gene expression as well as fecal microbiome datasets, illustrating the superior performance of the approach. Our method is implemented as a publicly available R-package, iNMF.</p>","PeriodicalId":49058,"journal":{"name":"International Journal of Biostatistics","volume":"18 1","pages":"203-218"},"PeriodicalIF":1.2,"publicationDate":"2021-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1515/ijb-2020-0039","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"25541025","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The area under the generalized receiver-operating characteristic curve.","authors":"Pablo Martínez-Camblor, Sonia Pérez-Fernández, Susana Díaz-Coto","doi":"10.1515/ijb-2020-0091","DOIUrl":"https://doi.org/10.1515/ijb-2020-0091","url":null,"abstract":"<p><p>The receiver operating-characteristic (ROC) curve is a well-known graphical tool routinely used for evaluating the discriminatory ability of continuous markers, referring to a binary characteristic. The area under the curve (AUC) has been proposed as a summarized accuracy index. Higher values of the marker are usually associated with higher probabilities of having the characteristic under study. However, there are other situations where both, higher and lower marker scores, are associated with a positive result. The generalized ROC (gROC) curve has been proposed as a proper extension of the ROC curve to fit these situations. Of course, the corresponding area under the gROC curve, gAUC, has also been introduced as a global measure of the classification capacity. In this paper, we study in deep the gAUC properties. The weak convergence of its empirical estimator is provided while deriving an explicit and useful expression for the asymptotic variance. We also obtain the expression for the asymptotic covariance of related gAUCs and propose a non-parametric procedure to compare them. The finite-samples behavior is studied through Monte Carlo simulations under different scenarios, presenting a real-world problem in order to illustrate its practical application. The <i>R</i> code functions implementing the procedures are provided as Supplementary Material.</p>","PeriodicalId":49058,"journal":{"name":"International Journal of Biostatistics","volume":"18 1","pages":"293-306"},"PeriodicalIF":1.2,"publicationDate":"2021-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1515/ijb-2020-0091","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"25512815","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Regularized bidimensional estimation of the hazard rate.","authors":"Vivien Goepp, Jean-Christophe Thalabard, Grégory Nuel, Olivier Bouaziz","doi":"10.1515/ijb-2019-0003","DOIUrl":"https://doi.org/10.1515/ijb-2019-0003","url":null,"abstract":"<p><p>In epidemiological or demographic studies, with variable age at onset, a typical quantity of interest is the incidence of a disease (for example the cancer incidence). In these studies, the individuals are usually highly heterogeneous in terms of dates of birth (the cohort) and with respect to the calendar time (the period) and appropriate estimation methods are needed. In this article a new estimation method is presented which extends classical age-period-cohort analysis by allowing interactions between age, period and cohort effects. We introduce a bidimensional regularized estimate of the hazard rate where a penalty is introduced on the likelihood of the model. This penalty can be designed either to smooth the hazard rate or to enforce consecutive values of the hazard to be equal, leading to a parsimonious representation of the hazard rate. In the latter case, we make use of an iterative penalized likelihood scheme to approximate the <i>L</i><sub>0</sub> norm, which makes the computation tractable. The method is evaluated on simulated data and applied on breast cancer survival data from the SEER program.</p>","PeriodicalId":49058,"journal":{"name":"International Journal of Biostatistics","volume":"18 1","pages":"263-277"},"PeriodicalIF":1.2,"publicationDate":"2021-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1515/ijb-2019-0003","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"25519056","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian approaches to variable selection: a comparative study from practical perspectives.","authors":"Zihang Lu, Wendy Lou","doi":"10.1515/ijb-2020-0130","DOIUrl":"https://doi.org/10.1515/ijb-2020-0130","url":null,"abstract":"<p><p>In many clinical studies, researchers are interested in parsimonious models that simultaneously achieve consistent variable selection and optimal prediction. The resulting parsimonious models will facilitate meaningful biological interpretation and scientific findings. Variable selection via Bayesian inference has been receiving significant advancement in recent years. Despite its increasing popularity, there is limited practical guidance for implementing these Bayesian approaches and evaluating their comparative performance in clinical datasets. In this paper, we review several commonly used Bayesian approaches to variable selection, with emphasis on application and implementation through R software. These approaches can be roughly categorized into four classes: namely the Bayesian model selection, spike-and-slab priors, shrinkage priors, and the hybrid of both. To evaluate their variable selection performance under various scenarios, we compare these four classes of approaches using real and simulated datasets. These results provide practical guidance to researchers who are interested in applying Bayesian approaches for the purpose of variable selection.</p>","PeriodicalId":49058,"journal":{"name":"International Journal of Biostatistics","volume":"18 1","pages":"83-108"},"PeriodicalIF":1.2,"publicationDate":"2021-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1515/ijb-2020-0130","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"25525255","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integrating additional knowledge into the estimation of graphical models.","authors":"Yunqi Bu, Johannes Lederer","doi":"10.1515/ijb-2020-0133","DOIUrl":"https://doi.org/10.1515/ijb-2020-0133","url":null,"abstract":"<p><p>Graphical models such as brain connectomes derived from functional magnetic resonance imaging (fMRI) data are considered a prime gateway to understanding network-type processes. We show, however, that standard methods for graphical modeling can fail to provide accurate graph recovery even with optimal tuning and large sample sizes. We attempt to solve this problem by leveraging information that is often readily available in practice but neglected, such as the spatial positions of the measurements. This information is incorporated into the tuning parameter of neighborhood selection, for example, in the form of pairwise distances. Our approach is computationally convenient and efficient, carries a clear Bayesian interpretation, and improves standard methods in terms of statistical stability. Applied to data about Alzheimer's disease, our approach allows us to highlight the central role of lobes in the connectivity structure of the brain and to identify an increased connectivity within the cerebellum for Alzheimer's patients compared to other subjects.</p>","PeriodicalId":49058,"journal":{"name":"International Journal of Biostatistics","volume":"18 1","pages":"1-17"},"PeriodicalIF":1.2,"publicationDate":"2021-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1515/ijb-2020-0133","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"25506376","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multiple scaled symmetric distributions in allometric studies.","authors":"Antonio Punzo, Luca Bagnato","doi":"10.1515/ijb-2020-0059","DOIUrl":"https://doi.org/10.1515/ijb-2020-0059","url":null,"abstract":"<p><p>In allometric studies, the joint distribution of the log-transformed morphometric variables is typically symmetric and with heavy tails. Moreover, in the bivariate case, it is customary to explain the morphometric variation of these variables by fitting a convenient line, as for example the first principal component (PC). To account for all these peculiarities, we propose the use of multiple scaled symmetric (MSS) distributions. These distributions have the advantage to be directly defined in the PC space, the kind of symmetry involved is less restrictive than the commonly considered elliptical symmetry, the behavior of the tails can vary across PCs, and their first PC is less sensitive to outliers. In the family of MSS distributions, we also propose the multiple scaled shifted exponential normal distribution, equivalent of the multivariate shifted exponential normal distribution in the MSS framework. For the sake of parsimony, we also allow the parameter governing the leptokurtosis on each PC, in the considered MSS distributions, to be tied across PCs. From an inferential point of view, we describe an EM algorithm to estimate the parameters by maximum likelihood, we illustrate how to compute standard errors of the obtained estimates, and we give statistical tests and confidence intervals for the parameters. We use artificial and real allometric data to appreciate the advantages of the MSS distributions over well-known elliptically symmetric distributions and to compare the robustness of the line from our models with respect to the lines fitted by well-established robust and non-robust methods available in the literature.</p>","PeriodicalId":49058,"journal":{"name":"International Journal of Biostatistics","volume":"18 1","pages":"219-242"},"PeriodicalIF":1.2,"publicationDate":"2021-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1515/ijb-2020-0059","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"25487581","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}