{"title":"Asymptotic cumulants of some information criteria","authors":"H. Ogasawara","doi":"10.5183/JJSCS.1512001_225","DOIUrl":"https://doi.org/10.5183/JJSCS.1512001_225","url":null,"abstract":"Asymptotic cumulants of the Akaike and Takeuchi information criteria are given under possible model misspecification up to the fourth order with the higher-order asymptotic variances, where two versions of the latter information criterion are defined using observed and estimated expected information matrices. The asymptotic cumulants are provided before and after studentization using the parameter estimators by the weighted-score method, which include the maximum likelihood and Bayes modal estimators as special cases. Higher-order bias corrections of the criteria are derived using log-likelihood derivatives, which yields simple results for cases under canonical parametrization in the exponential family. It is shown that in these cases the Jeffreys prior gives the vanishing higher-order bias of the Akaike information criterion. The results are illustrated by three examples. Simulations for model selection in regression and interval estimation are also given.","PeriodicalId":338719,"journal":{"name":"Journal of the Japanese Society of Computational Statistics","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123169438","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ADHERENTLY PENALIZED LINEAR DISCRIMINANT ANALYSIS","authors":"H. Hino, Jun Fujiki","doi":"10.5183/JJSCS.1412001_219","DOIUrl":"https://doi.org/10.5183/JJSCS.1412001_219","url":null,"abstract":"A problem of supervised learning in which the data consist of p features and n observations is considered. Each observation is assumed to belong to either one of the two classes. Linear discriminant analysis (LDA) has been widely used for both classification and dimensionality reduction in this setting. However, when the dimensionality p is high and the observations are scarce, LDA does not offer a satisfactory result for classification. Witten & Tibshirani (2011) proposed the penalized LDA based on the Fisher’s discriminant problem with sparsity penalization. In this paper, an elastic-net type penalization is considered for LDA, and the corresponding optimization problem is efficiently solved.","PeriodicalId":338719,"journal":{"name":"Journal of the Japanese Society of Computational Statistics","volume":"128 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132256875","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"POWER CALCULATIONS IN CLINICAL TRIALS WITH COMPLEX CLINICAL OBJECTIVES","authors":"A. Dmitrienko, G. Paux, T. Brechenmacher","doi":"10.5183/JJSCS.1411001_213","DOIUrl":"https://doi.org/10.5183/JJSCS.1411001_213","url":null,"abstract":"Over the past decade, a variety of powerful multiple testing procedures have been developed for the analysis of clinical trials with multiple clinical objectives based, for example, on several endpoints, dose-placebo comparisons and patient subgroups. Sample size and power calculations in these complex settings are not straightforward and, in general, simulation-based methods are used. In this paper, we provide an overview of power evaluation approaches in the context of clinical trials with multiple objectives and illustrate the key principles using case studies commonly seen in the development of new therapies.","PeriodicalId":338719,"journal":{"name":"Journal of the Japanese Society of Computational Statistics","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131260606","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SPARSE PREDICTIVE MODELING FOR BANK TELEMARKETING SUCCESS USING SMOOTH-THRESHOLD ESTIMATING EQUATIONS","authors":"Y. Kawasaki, Masao Ueki","doi":"10.5183/JJSCS.1502003_217","DOIUrl":"https://doi.org/10.5183/JJSCS.1502003_217","url":null,"abstract":"In this paper, we attempt to build and evaluate several predictive models to predict success of telemarketing calls for selling bank long-term deposits using a publicly available set of data from a Portuguese retail bank collected from 2008 to 2013 (Moro et al., 2014, Decision Support Systems). The data include multiple predictor variables, either numeric or categorical, related with bank client, product and social-economic attributes. Dealing with a categorical predictor variable as multiple dummy variables increases model dimensionality, and redundancy in model parameterization must be of practical concern. This motivates us to assess prediction performance with more parsimonious modeling. We apply contemporary variable selection methods with penalization including lasso, elastic net, smoothly-clipped absolute deviation, minimum concave penalty as well as the smooth-threshold estimating equation. In addition to variable selection, the smooth-threshold estimating equation can achieve automatic grouping of predictor variables, which is an alternative sparse modeling to perform variable selection and could be suited to a certain problem, e.g., dummy variables created from categorical predictor variables. Predictive power of each modeling approach is assessed by repeating cross-validation experiments or sample splitting, one for training and another for testing.","PeriodicalId":338719,"journal":{"name":"Journal of the Japanese Society of Computational Statistics","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122086869","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ESTIMATING SCALE-FREE NETWORKS VIA THE EXPONENTIATION OF MINIMAX CONCAVE PENALTY","authors":"K. Hirose, Y. Ogura, Hidetoshi Shimodaira","doi":"10.5183/JJSCS.1503001_215","DOIUrl":"https://doi.org/10.5183/JJSCS.1503001_215","url":null,"abstract":"We consider the problem of sparse estimation of undirected graphical models via the L1 regularization. The ordinary lasso encourages the sparsity on all edges equally likely, so that all nodes tend to have small degrees. On the other hand, many real-world networks are often scale-free, where some nodes have a large number of edges. In such cases, a penalty that induces structured sparsity, such as a log penalty, performs better than the ordinary lasso. In practical situations, however, it is difficult to determine an optimal penalty among the ordinary lasso, log penalty, or somewhere in between. In this paper, we introduce a new class of penalty that is based on the exponentiation of the minimax concave penalty. The proposed penalty includes both the lasso and the log penalty, and the gap between these two penalties is bridged by a tuning parameter. We apply cross-validation to select an appropriate value of the tuning parameter. Monte Carlo simulations are conducted to investigate the performance of our proposed procedure. The numerical result shows that the proposed method can perform better than the existing log penalty and the ordinary lasso.","PeriodicalId":338719,"journal":{"name":"Journal of the Japanese Society of Computational Statistics","volume":"161 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133107462","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"STOCHASTIC ALTERNATING DIRECTION METHOD OF MULTIPLIERS FOR STRUCTURED REGULARIZATION","authors":"Taiji Suzuki","doi":"10.5183/JJSCS.1502004_218","DOIUrl":"https://doi.org/10.5183/JJSCS.1502004_218","url":null,"abstract":"In this paper, we present stochastic optimization variants of the alternating direction method of multipliers (ADMM). ADMM is a useful method to solve a regularized risk minimization problem where the regularization term is complicated and not easily dealt with in an ordinary manner. For example, structured regularization is one of the typical applications of such regularization in which ADMM is effective. It includes group lasso regularization, low rank tensor regularization, and fused lasso regularization. Since ADMM is a general method and has wide applications, it is intensively studied and refined these days. However, ADMM is not suited to optimization problems with huge data. To resolve this problem, online stochastic optimization variants and a batch stochastic optimization variant of ADMM are presented. All the presented methods can be easily implemented and have wide applications. Moreover, the theoretical guarantees of the methods are given.","PeriodicalId":338719,"journal":{"name":"Journal of the Japanese Society of Computational Statistics","volume":"137 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124668964","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PREDICTIVE MODEL SELECTION CRITERIA FOR BAYESIAN LASSO REGRESSION","authors":"Shuichi Kawano, Ibuki Hoshina, Kaito Shimamura, S. Konishi","doi":"10.5183/JJSCS.1501001_220","DOIUrl":"https://doi.org/10.5183/JJSCS.1501001_220","url":null,"abstract":"We consider the Bayesian lasso for regression, which can be interpreted as an L 1 norm regularization based on a Bayesian approach when the Laplace or double-exponential prior distribution is placed on the regression coefficients. A crucial issue is an appropriate choice of the values of hyperparameters included in the prior distributions, which essentially control the sparsity in the estimated model. To choose the values of tuning parameters, we introduce a model selection criterion for evaluating a Bayesian predictive distribution for the Bayesian lasso. Numerical results are presented to illustrate the properties of our sparse Bayesian modeling procedure.","PeriodicalId":338719,"journal":{"name":"Journal of the Japanese Society of Computational Statistics","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114583956","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"EDITORIAL: RECENT ADVANCES IN SPARSE STATISTICAL MODELING","authors":"K. Hirose","doi":"10.5183/JJSCS.1510002_225","DOIUrl":"https://doi.org/10.5183/JJSCS.1510002_225","url":null,"abstract":"The first term L(β) is a loss function and the second term λ ∑p j=1 |βj | is a penalty term. Here λ (λ > 0) is a tuning parameter which controls the sparsity and the model fitting. Because the penalty term consists of the sum of absolute values of the parameter, we can carry out the sparse estimation, that is, some of the elements of β are estimated by exactly zeros. It is well-known that we cannot often obtain the analytical solutions of the minimization problem (1), because the penalty term λ ∑p j=1 |βj | is indifferentiable when βj = 0 (j = 1, . . . , p). Therefore, it is important to develop efficient computational algorithms. This special issue includes six interesting papers related to sparse estimation. These papers cover a wide variety of topics, such as statistical modeling, computation, theoretical analysis, and applications. In particular, all of the papers deal with the issue of statistical computation. Kawasaki and Ueki (the first paper of this issue) apply smooth-threshold estimating equations (STEE, Ueki, 2009) to telemarketing success data collected from a Portuguese retail bank. In STEE, the penalty term consists of a quadratic form ∑p j=1 wjβ 2 j instead of ∑p j=1 |βj |, where wj (j = 1, . . . , p) are positive values allowed to be ∞, so that we do not need to implement a computational algorithm that is used in the L1 regularization. Kawano, Hoshina, Shimamura and Konishi (the second paper) propose a model selection criterion for choosing tuning parameters in the Bayesian lasso (Park and Casella, 2008). They use an efficient sparse estimation algorithm in the Bayesian lasso, referred to as the sparse algorithm. Matsui (the third paper) considers the problem of bi-level selection, which allows the selection of groups of variables and individuals simultaneously. The parameter estimation procedure is based on the coordinate descent algorithm, which is known as a remarkably fast algorithm (Friedman et al., 2010). Suzuki (the fourth paper) focuses attention on the alternating direction method of multipliers algorithm (ADMM algorithm, Boyd et al., 2011), which is applicable to various complex penalties such as the overlapping group lasso (Jacob et al., 2009). He reviews a stochastic version of the ADMM algorithm that allows the online learning. Hino and Fujiki (the fifth paper) propose a penalized linear discriminant analysis that adheres to the normal discriminant model. They apply the Majorize-Minimization algorithm (MM algorithm, Hunter and Lange 2004), which is often used to replace a non-convex optimization problem with a reweighted convex optimization","PeriodicalId":338719,"journal":{"name":"Journal of the Japanese Society of Computational Statistics","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117292893","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SPARSE REGULARIZATION FOR BI-LEVEL VARIABLE SELECTION","authors":"H. Matsui","doi":"10.5183/JJSCS.1502001_216","DOIUrl":"https://doi.org/10.5183/JJSCS.1502001_216","url":null,"abstract":"Sparse regularization provides solutions in which some parameters are exactly zero and therefore they can be used for selecting variables in regression models and so on. The lasso is proposed as a method for selecting individual variables for regression models. On the other hand, the group lasso selects groups of variables rather than individuals and therefore it has been used in various fields of applications. More recently, penalties that select variables at both the group and individual levels has been considered. They are so called bi-level selection. In this paper we focus on some penalties that aim for bi-level selection. We overview these penalties and estimation algorithms, and then compare the effectiveness of these penalties from the viewpoint of accuracy of prediction and selection of variables and groups through simulation studies.","PeriodicalId":338719,"journal":{"name":"Journal of the Japanese Society of Computational Statistics","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127828020","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A SEQUENTIAL MULTIPLE COMPARISON PROCEDURE FOR DETECTING A LOWEST DOSE HAVING INTERACTION IN A DOSE-RESPONSE TEST","authors":"Tomohiro Nakamura, H. Douke","doi":"10.5183/JJSCS.1406001_212","DOIUrl":"https://doi.org/10.5183/JJSCS.1406001_212","url":null,"abstract":"In this study, we propose a multiple comparison procedure for detecting sequentially a lowest dose level having interaction based on two dose sample means on two treatments with increasing dose levels in a dose-response test. We apply a group sequential procedure in order to realize our method that tests sequentially the null hypotheses of no interaction based on tetrad differences. If we can first detect a dose level having interaction at an early stage in the sequential test, since we can terminate the procedure with just the few observations up to that stage, the procedure is useful from an economical point of view. In the procedure, we present an integral formula to determine the repeated confidence boundaries for satisfying a predefined type I familywise error rate. Furthermore, we show how to decide a required sample size in each cell so as to guarantee the power of the test. In the simulation studies, we evaluate the superiority among the procedures based on three α spending functions in terms of the power of the test and the required sample size for various configurations of population means.","PeriodicalId":338719,"journal":{"name":"Journal of the Japanese Society of Computational Statistics","volume":"81 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134152891","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}