{"title":"A modification of McFadden's R2 for binary and ordinal response models","authors":"E. Ugba, J. Gertheiss","doi":"10.29220/CSAM.2023.30.1.049","DOIUrl":"https://doi.org/10.29220/CSAM.2023.30.1.049","url":null,"abstract":"A lot of studies on the summary measures of predictive strength of categorical response models consider the likelihood ratio index (LRI), also known as the McFadden-$R^2$, a better option than many other measures. We propose a simple modification of the LRI that adjusts for the effect of the number of response categories on the measure and that also rescales its values, mimicking an underlying latent measure. The modified measure is applicable to both binary and ordinal response models fitted by maximum likelihood. Results from simulation studies and a real data example on the olfactory perception of boar taint show that the proposed measure outperforms most of the widely used goodness-of-fit measures for binary and ordinal models. The proposed $R^2$ interestingly proves quite invariant to an increasing number of response categories of an ordinal model.","PeriodicalId":44931,"journal":{"name":"Communications for Statistical Applications and Methods","volume":" ","pages":""},"PeriodicalIF":0.4,"publicationDate":"2022-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48518821","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A response probability estimation for non-ignorable non-response","authors":"H. Chung, Key-il Shin","doi":"10.29220/csam.2022.29.2.263","DOIUrl":"https://doi.org/10.29220/csam.2022.29.2.263","url":null,"abstract":"Use of appropriate technique for non-response occurring in sample survey improves the accuracy of the estimation. Many studies have been conducted for handling non-ignorable non-response and commonly the response probability is estimated using the propensity score method. Recently, post-stratification method to obtain the response probability proposed by Chung and Shin (2017) reduces the effect of bias and gives a good performance in terms of the MSE. In this study, we propose a new response probability estimation method by combining the propensity score adjustment method using the logistic regression model with post-stratification method used in Chung and Shin (2017). The superiority of the proposed method is confirmed through simulation.","PeriodicalId":44931,"journal":{"name":"Communications for Statistical Applications and Methods","volume":" ","pages":""},"PeriodicalIF":0.4,"publicationDate":"2022-03-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48671255","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Clustering non-stationary advanced metering infrastructure data","authors":"Dong-Gyun Kang, Yaeji Lim","doi":"10.29220/csam.2022.29.2.225","DOIUrl":"https://doi.org/10.29220/csam.2022.29.2.225","url":null,"abstract":"In this paper, we propose a clustering method for advanced metering infrastructure (AMI) data in Korea. As AMI data presents non-stationarity, we consider time-dependent frequency domain principal components analysis, which is a proper method for locally stationary time series data. We develop a new clustering method based on time-varying eigenvectors, and our method provides a meaningful result that is di ff erent from the clustering results obtained by employing conventional methods, such as K -means and K -centres functional clustering. Simulation study demonstrates the superiority of the proposed approach. We further apply the clustering results to the evaluation of the electricity price system in South Korea, and validate the reform of the progressive electricity tari ff system.","PeriodicalId":44931,"journal":{"name":"Communications for Statistical Applications and Methods","volume":" ","pages":""},"PeriodicalIF":0.4,"publicationDate":"2022-03-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46002108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A review and comparison of convolution neural network models under a unified framework","authors":"Jimin Park, Yoonsuh Jung","doi":"10.29220/csam.2022.29.2.161","DOIUrl":"https://doi.org/10.29220/csam.2022.29.2.161","url":null,"abstract":"There has been active research in image classification using deep learning convolutional neural network (CNN) models. ImageNet large-scale visual recognition challenge (ILSVRC) (2010-2017) was one of the most important competitions that boosted the development of e ffi cient deep learning algorithms. This paper introduces and compares six monumental models that achieved high prediction accuracy in ILSVRC. First, we provide a review of the models to illustrate their unique structure and characteristics of the models. We then compare those models under a unified framework. For this reason, additional devices that are not crucial to the structure are excluded. Four popular data sets with di ff erent characteristics are then considered to measure the prediction accuracy. By investigating the characteristics of the data sets and the models being compared, we provide some insight into the architectural features of the models.","PeriodicalId":44931,"journal":{"name":"Communications for Statistical Applications and Methods","volume":" ","pages":""},"PeriodicalIF":0.4,"publicationDate":"2022-03-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44520076","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Bayesian joint model for continuous and zero-inflated count data in developmental toxicity studies","authors":"B. Hwang","doi":"10.29220/csam.2022.29.2.239","DOIUrl":"https://doi.org/10.29220/csam.2022.29.2.239","url":null,"abstract":"In many applications, we frequently encounter correlated multiple outcomes measured on the same subject. Joint modeling of such multiple outcomes can improve e ffi ciency of inference compared to independent modeling. For instance, in developmental toxicity studies, fetal weight and number of malformed pups are measured on the pregnant dams exposed to di ff erent levels of a toxic substance, in which the association between such outcomes should be taken into account in the model. The number of malformations may possibly have many zeros, which should be analyzed via zero-inflated count models. Motivated by applications in developmental toxicity studies, we propose a Bayesian joint modeling framework for continuous and count outcomes with excess zeros. In our model, zero-inflated Poisson (ZIP) regression model would be used to describe count data, and a subject-specific random e ff ects would account for the correlation across the two outcomes. We implement a Bayesian approach using MCMC procedure with data augmentation method and adaptive rejection sampling. We apply our proposed model to dose-response analysis in a developmental toxicity study to estimate the benchmark dose in a risk assessment.","PeriodicalId":44931,"journal":{"name":"Communications for Statistical Applications and Methods","volume":" ","pages":""},"PeriodicalIF":0.4,"publicationDate":"2022-03-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42556690","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning fair prediction models with an imputed sensitive variable: Empirical studies","authors":"Yongdai Kim, Hwichang Jeong","doi":"10.29220/csam.2022.29.2.251","DOIUrl":"https://doi.org/10.29220/csam.2022.29.2.251","url":null,"abstract":"As AI has a wide range of influence on human social life, issues of transparency and ethics of AI are emerg-ing. In particular, it is widely known that due to the existence of historical bias in data against ethics or regulatory frameworks for fairness, trained AI models based on such biased data could also impose bias or unfairness against a certain sensitive group (e.g., non-white, women). Demographic disparities due to AI, which refer to socially unacceptable bias that an AI model favors certain groups (e.g., white, men) over other groups (e.g., black, women), have been observed frequently in many applications of AI and many studies have been done recently to develop AI algorithms which remove or alleviate such demographic disparities in trained AI models. In this paper, we consider a problem of using the information in the sensitive variable for fair prediction when using the sensitive variable as a part of input variables is prohibitive by laws or regulations to avoid unfairness. As a way of reflecting the information in the sensitive variable to prediction, we consider a two-stage procedure. First, the sensitive variable is fully included in the learning phase to have a prediction model depending on the sensitive variable, and then an imputed sensitive variable is used in the prediction phase. The aim of this paper is to evaluate this procedure by analyzing several benchmark datasets. We illustrate that using an imputed sensitive variable is helpful to improve prediction accuracies without hampering the degree of fairness much.","PeriodicalId":44931,"journal":{"name":"Communications for Statistical Applications and Methods","volume":" ","pages":""},"PeriodicalIF":0.4,"publicationDate":"2022-03-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41495243","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian inference of the cumulative logistic principal component regression models","authors":"Minjung Kyung","doi":"10.29220/csam.2022.29.2.203","DOIUrl":"https://doi.org/10.29220/csam.2022.29.2.203","url":null,"abstract":"We propose a Bayesian approach to cumulative logistic regression model for the ordinal response based on the orthogonal principal components via singular value decomposition considering the multicollinearity among predictors. The advantage of the suggested method is considering dimension reduction and parameter estimation simultaneously. To evaluate the performance of the proposed model we conduct a simulation study with considering a high-dimensional and highly correlated explanatory matrix. Also, we fit the suggested method to a real data concerning sproutand scab-damaged kernels of wheat and compare it to EM based proportional-odds logistic regression model. Compared to EM based methods, we argue that the proposed model works better for the highly correlated high-dimensional data with providing parameter estimates and provides good predictions.","PeriodicalId":44931,"journal":{"name":"Communications for Statistical Applications and Methods","volume":" ","pages":""},"PeriodicalIF":0.4,"publicationDate":"2022-03-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48193100","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Repair policies of failure detection equipments and system availability","authors":"Seongryong Na, Sunghoon Bang","doi":"10.29220/csam.2022.29.2.151","DOIUrl":"https://doi.org/10.29220/csam.2022.29.2.151","url":null,"abstract":"The total system is composed of the main system (MS) and the failure detection equipment (FDE) which detects failures of MS. The analysis of system reliability is performed when the failure of FDE is possible. Several repair policies are considered to determine the order of repair of failed systems, which are sequential repair (SQ), priority repair (PR), independent repair (ID), and simultaneous repair (SM). The states of MS-FDE systems are represented by Markov models according to repair policies and the main purpose of this paper is to derive the system availabilities of the Markov models. Analytical solutions of the stationary equations are derived for the Markov models and the system availabilities are immediately determined using the stationary solutions. A simple illustrative example is discussed for the comparison of availability values of the repair policies considered in this paper.","PeriodicalId":44931,"journal":{"name":"Communications for Statistical Applications and Methods","volume":" ","pages":""},"PeriodicalIF":0.4,"publicationDate":"2022-03-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42011227","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring modern machine learning methods to improve causal-effect estimation","authors":"Yeji Kim, Tae-Kil Choi, Sangbum Choi","doi":"10.29220/csam.2022.29.2.177","DOIUrl":"https://doi.org/10.29220/csam.2022.29.2.177","url":null,"abstract":"This paper addresses the use of machine learning methods for causal estimation of treatment effects from observational data. Even though conducting randomized experimental trials is a gold standard to reveal potential causal relationships, observational study is another rich source for investigation of exposure effects, for example, in the research of comparative effectiveness and safety of treatments, where the causal effect can be identified if covariates contain all confounding variables. In this context, statistical regression models for the expected outcome and the probability of treatment are often imposed, which can be combined in a clever way to yield more efficient and robust causal estimators. Recently, targeted maximum likelihood estimation and causal random forest is proposed and extensively studied for the use of data-adaptive regression in estimation of causal inference parameters. Machine learning methods are a natural choice in these settings to improve the quality of the final estimate of the treatment effect. We explore how we can adapt the design and training of several machine learning algorithms for causal inference and study their finite-sample performance through simulation experiments under various scenarios. Application to the percutaneous coronary intervention (PCI) data shows that these adaptations can improve simple linear regression-based methods.","PeriodicalId":44931,"journal":{"name":"Communications for Statistical Applications and Methods","volume":" ","pages":""},"PeriodicalIF":0.4,"publicationDate":"2022-03-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48581746","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Immediate solution of EM algorithm for non-blind image deconvolution","authors":"Seung-Gu Kim","doi":"10.29220/csam.2022.29.2.277","DOIUrl":"https://doi.org/10.29220/csam.2022.29.2.277","url":null,"abstract":"Due to the uniquely slow convergence speed of the EM algorithm, it su ff ers form a lot of processing time until the desired deconvolution image is obtained when the image is large. To cope with the problem, in this paper, an immediate solution of the EM algorithm is provided under the Gaussian image model. It is derived by finding the recurrent formular of the EM algorithm and then substituting the results repeatedly. In this paper, two types of immediate soultion of image deconboution by EM algorithm are provided, and both methods have been shown to work well. It is expected that it free the processing time of image deconvolution because it no longer requires an iterative process. Based on this, we can find the statistical properties of the restored image at specific iterates. We demonstrate the e ff ectiveness of the proposed method through a simple experiment, and discuss future concerns.","PeriodicalId":44931,"journal":{"name":"Communications for Statistical Applications and Methods","volume":" ","pages":""},"PeriodicalIF":0.4,"publicationDate":"2022-03-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45050866","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}