{"title":"A pooled Bayes test of independence using restricted pooling model for contingency tables from small areas","authors":"A. Jo, D. Kim","doi":"10.29220/csam.2022.29.5.547","DOIUrl":"https://doi.org/10.29220/csam.2022.29.5.547","url":null,"abstract":"For a chi-squared test, which is a statistical method used to test the independence of a contingency table of two factors, the expected frequency of each cell must be greater than 5. The percentage of cells with an expected frequency below 5 must be less than 20% of all cells. However, there are many cases in which the regional expected frequency is below 5 in general small area studies. Even in large-scale surveys, it is difficult to forecast the expected frequency to be greater than 5 when there is small area estimation with subgroup analysis. Another statistical method to test independence is to use the Bayes factor, but since there is a high ratio of data dependency due to the nature of the Bayesian approach, the low expected frequency tends to decrease the precision of the test results. To overcome these limitations, we will borrow information from areas with similar characteristics and pool the data statistically to propose a pooled Bayes test of independence in target areas. Jo et al. (2021) suggested hierarchical Bayesian pooling models for small area estimation of categorical data, and we will introduce the pooled Bayes factors calculated by expanding their restricted pooling model. We applied the pooled Bayes factors using bone mineral density and body mass index data from the Third National Health and Nutrition Examination Survey conducted in the United States and compared them with chi-squared tests often used in tests of independence.","PeriodicalId":44931,"journal":{"name":"Communications for Statistical Applications and Methods","volume":" ","pages":""},"PeriodicalIF":0.4,"publicationDate":"2022-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44235913","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Naive Bayes classifiers boosted by sufficient dimension reduction: applications to top-k classification","authors":"Su Hyeong Yang, S. Shin, Woo-Chang Sung, Choon Won Lee","doi":"10.29220/csam.2022.29.5.603","DOIUrl":"https://doi.org/10.29220/csam.2022.29.5.603","url":null,"abstract":"The naive Bayes classifier is one of the most straightforward classification tools and directly estimates the class probability. However, because it relies on the independent assumption of the predictor, which is rarely satisfied in real-world problems, its application is limited in practice. In this article, we propose employing sufficient dimension reduction (SDR) to substantially improve the performance of the naive Bayes classifier, which is often deteriorated when the number of predictors is not restrictively small. This is not surprising as SDR reduces the predictor dimension without sacrificing classification information, and predictors in the reduced space are constructed to be uncorrelated. Therefore, SDR leads the naive Bayes to no longer be naive. We applied the proposed naive Bayes classifier after SDR to build a recommendation system for the eyewear-frames based on customers’ face shape, demonstrating its utility in the top-k classification problem.","PeriodicalId":44931,"journal":{"name":"Communications for Statistical Applications and Methods","volume":" ","pages":""},"PeriodicalIF":0.4,"publicationDate":"2022-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48107642","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparison of covariance thresholding methods in gene set analysis","authors":"Sora Park, Kipoong Kim, Hokeun Sun","doi":"10.29220/csam.2022.29.5.591","DOIUrl":"https://doi.org/10.29220/csam.2022.29.5.591","url":null,"abstract":"","PeriodicalId":44931,"journal":{"name":"Communications for Statistical Applications and Methods","volume":" ","pages":""},"PeriodicalIF":0.4,"publicationDate":"2022-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44530477","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Intensive comparison of semi-parametric and non-parametric dimension reduction methods in forward regression","authors":"Minju Shin, J. Yoo","doi":"10.29220/csam.2022.29.5.615","DOIUrl":"https://doi.org/10.29220/csam.2022.29.5.615","url":null,"abstract":"Principal Fitted Component (PFC) is a semi-parametric sufficient dimension reduction (SDR) method, which was originally proposed in Cook (2007). According to Cook (2007), the PFC has a connection with other usual non-parametric SDR methods. The connection is limited to sliced inverse regression (Li, 1991) and ordinary least squares. Since there has been no direct comparison between the two approaches in various forward regressions to date, practical guidance on choosing between the two approaches is necessary for statistical practitioners. To fill this practical necessity, in this paper, we newly derive a connection of the PFC to covariance methods (Yin and Cook, 2002), which is one of the most popular SDR methods. Also, intensive numerical studies have been done to closely examine and compare the estimation performances of the semi- and non-parametric SDR methods for various forward regressions. The findings from the numerical studies are confirmed in a real data example.","PeriodicalId":44931,"journal":{"name":"Communications for Statistical Applications and Methods","volume":" ","pages":""},"PeriodicalIF":0.4,"publicationDate":"2022-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45130520","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Real-time prediction for multi-wave COVID-19 outbreaks","authors":"F. Zuhairoh, D. Rosadi","doi":"10.29220/csam.2022.29.5.499","DOIUrl":"https://doi.org/10.29220/csam.2022.29.5.499","url":null,"abstract":"Intervention measures have been implemented worldwide to reduce the spread of the COVID-19 outbreak. The COVID-19 outbreak has occurred in several waves of infection, so this paper divides countries into three groups, namely countries that have passed the pandemic period, countries that are still experiencing a single-wave pandemic, and countries that are experiencing a multi-wave pandemic. The purpose of this study is to develop a multi-wave Richards model with several changepoint detection methods so as to obtain more accurate prediction results, especially for the multi-wave case. We investigated epidemiological trends in different countries from January 2020 to October 2021 to determine the temporal changes during the epidemic with respect to the intervention strategy used. In this article, we adjust the daily cumulative epidemiological data for COVID-19 using the logistic growth model and the multi-wave Richards curve development model. The changepoint detection methods used include the interpolation method, the Pruned Exact Linear Time (PELT) method, and the Binary Segmentation (BS) method. The results of the analysis using 9 countries show that the Richards model development can be used to analyze multi-wave data using changepoint detection so that the initial data used for prediction on the last wave can be determined precisely. The changepoint used is the coincident changepoint generated by the PELT and BS methods. The interpolation method is only used to find out how many pandemic waves have occurred in a given country. Several waves have been identified and can better describe the data. Our results can find the peak of the pandemic and when it will end in each country, both for a single-wave pandemic and a multi-wave pandemic.","PeriodicalId":44931,"journal":{"name":"Communications for Statistical Applications and Methods","volume":" ","pages":""},"PeriodicalIF":0.4,"publicationDate":"2022-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48268326","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparison of tree-based ensemble models for regression","authors":"Sangho Park, C. Kim","doi":"10.29220/csam.2022.29.5.561","DOIUrl":"https://doi.org/10.29220/csam.2022.29.5.561","url":null,"abstract":"","PeriodicalId":44931,"journal":{"name":"Communications for Statistical Applications and Methods","volume":" ","pages":""},"PeriodicalIF":0.4,"publicationDate":"2022-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49385336","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Monitoring social networks based on transformation into categorical data","authors":"J. Lee, Jaeheon Lee","doi":"10.29220/csam.2022.29.4.487","DOIUrl":"https://doi.org/10.29220/csam.2022.29.4.487","url":null,"abstract":"","PeriodicalId":44931,"journal":{"name":"Communications for Statistical Applications and Methods","volume":" ","pages":""},"PeriodicalIF":0.4,"publicationDate":"2022-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49422989","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Kullback-Leibler divergence based comparison of approximate Bayesian estimations of ARMA models","authors":"A. Amin","doi":"10.29220/csam.2022.29.4.471","DOIUrl":"https://doi.org/10.29220/csam.2022.29.4.471","url":null,"abstract":"","PeriodicalId":44931,"journal":{"name":"Communications for Statistical Applications and Methods","volume":" ","pages":""},"PeriodicalIF":0.4,"publicationDate":"2022-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41742335","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ADMM for least square problems with pairwise-difference penalties for coefficient grouping","authors":"So-Hyun Parka, S. Shin","doi":"10.29220/csam.2022.29.4.441","DOIUrl":"https://doi.org/10.29220/csam.2022.29.4.441","url":null,"abstract":"","PeriodicalId":44931,"journal":{"name":"Communications for Statistical Applications and Methods","volume":" ","pages":""},"PeriodicalIF":0.4,"publicationDate":"2022-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46616005","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An R package UnifiedDoseFinding for continuous and ordinal outcomes in Phase I dose-finding trials","authors":"H. Pan, Rongji Mu, Chia-Wei Hsu, Shouhao Zhou","doi":"10.29220/csam.2022.29.4.421","DOIUrl":"https://doi.org/10.29220/csam.2022.29.4.421","url":null,"abstract":"","PeriodicalId":44931,"journal":{"name":"Communications for Statistical Applications and Methods","volume":" ","pages":""},"PeriodicalIF":0.4,"publicationDate":"2022-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49194166","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}