{"title":"Post-shrinkage strategies in statistical and machine learning for high dimensional data","authors":"","doi":"10.1080/02664763.2023.2286426","DOIUrl":"https://doi.org/10.1080/02664763.2023.2286426","url":null,"abstract":"Published in Journal of Applied Statistics (Ahead of Print, 2023)","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"8 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2023-11-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138514111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A characteristic function based circular distribution family and its goodness of fit : The flexible wrapped Linnik family","authors":"Ashis SenGupta, Moumita Roy","doi":"10.1080/02664763.2023.2283689","DOIUrl":"https://doi.org/10.1080/02664763.2023.2283689","url":null,"abstract":"In this article, the primary aim is to introduce a new flexible family of circular distributions, namely the wrapped Linnik family which possesses the flexibility to model the inflection points and...","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"48 10","pages":""},"PeriodicalIF":1.5,"publicationDate":"2023-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138514115","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Determination of the number of clusters through logistic regression analysis","authors":"Soumita Modak","doi":"10.1080/02664763.2023.2283687","DOIUrl":"https://doi.org/10.1080/02664763.2023.2283687","url":null,"abstract":"We advise a novel measure to determine the unknown number of clusters underlying a designated sample through implementation of the parametric logistic regression model. The regression analysis is c...","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"47 19","pages":""},"PeriodicalIF":1.5,"publicationDate":"2023-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138514117","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Estimating intracluster correlation for ordinal data","authors":"Benjamin W. Langworthy, Zhaoxun Hou, Gary C. Curhan, Sharon G. Curhan, Molin Wang","doi":"10.1080/02664763.2023.2280821","DOIUrl":"https://doi.org/10.1080/02664763.2023.2280821","url":null,"abstract":"In this paper, we consider the estimation of intracluster correlation for ordinal data. We focus on pure-tone audiometry hearing threshold data, where thresholds are measured in 5 decibel increment...","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"151 ","pages":""},"PeriodicalIF":1.5,"publicationDate":"2023-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138514114","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparison of two statistical methodologies for a binary classification problem of two-dimensional images","authors":"Deniz A. Sanchez S., Rubén D. Guevara G., Sergio A. Calderón V.","doi":"10.1080/02664763.2023.2279012","DOIUrl":"https://doi.org/10.1080/02664763.2023.2279012","url":null,"abstract":"The present work intends to compare two statistical classification methods using images as covariates and under the comparison criterion of the ROC curve. The first implemented procedure is based o...","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"48 6","pages":""},"PeriodicalIF":1.5,"publicationDate":"2023-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138514116","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GWR-assisted integrated estimator of finite population total under two-phase sampling: a model-assisted approach","authors":"Nobin Chandra Paul, Anil Rai, Tauqueer Ahmad, Ankur Biswas, Prachi Misra Sahoo","doi":"10.1080/02664763.2023.2280879","DOIUrl":"https://doi.org/10.1080/02664763.2023.2280879","url":null,"abstract":"In survey sampling, auxiliary information is used to precisely estimate the finite population parameters. There are several approaches available in the literature that provide a practical method for incorporating auxiliary information during the estimation stage. In order to effectively utilize the auxiliary information, a geographically weighted regression (GWR) model-assisted integrated estimator of finite population total under a two-phase sampling design has been proposed in this article. Spatial simulation studies have been conducted to empirically assess the statistical properties of the proposed estimator. In the presence of spatial non-stationarity, empirical findings reveal that the proposed estimator outperforms all existing estimators such as two-phase HT, ratio, and regression estimators, demonstrating the importance of spatial information in survey sampling. Keywords: Data integration; geographically weighted regression; model-assisted approach; spatial non-stationarity; two-phase regression. Acknowledgement: The authors are thankful to the blind reviewers for providing valuable suggestions that have greatly enhanced the quality of the article. The first author would like to express his heartfelt gratitude to the ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India, for providing the real CCE survey data, lab facilities, and overall support to conduct the research work during Ph.D. programme. Disclosure statement: No potential conflict of interest was reported by the authors. Data availability statement: Data sharing is not applicable.","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"15 6","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134900822","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Forecasting of the true satellite carbon monoxide data with ensemble empirical mode decomposition, singular value decomposition and moving average","authors":"Sameer Poongadan, M. C. Lineesh","doi":"10.1080/02664763.2023.2277115","DOIUrl":"https://doi.org/10.1080/02664763.2023.2277115","url":null,"abstract":"The forecasting of carbon monoxide in the atmosphere is essential as it causes the pollution of the atmosphere and hence severe health problems for humans. This study proposes a time-series prognosis EEMD-SVD-MA technique which incorporates Ensemble Empirical Mode Decomposition, Singular Value Decomposition and Moving Average, to predict the prospects of carbon monoxide data taken from the Indian region. The collected data are non-linear. The technique can be applied for non-stationary and non-linear data. In this approach, there are three levels: EEMD level, SVD level and MA level. The first level deploys EEMD to fragment data series into a limited number of Intrinsic Mode Function (IMF) components along with a residue. To denoise each IMF component, SVD is deployed in the second level. In the third level, each denoised IMF component is predicted by MA. The future values of the original data are obtained by adding all the predicted series of the components. In this study, we proposed two variants of the model: EEMD-SVD-MA(3) and EEMD-SVD-MA(4) and compared the results with other forecasting techniques, namely LSTM (Long Short Term Memory network), EMD-LSTM, EMD-MA, EEMD-MA and CEEMDAN-MA. The results show that the proposed EEMD-SVD-MA model is more efficient than other models. Keywords: Intrinsic mode function; empirical mode decomposition; ensemble empirical mode decomposition; singular value decomposition; moving average; long short term memory network. Mathematics Subject Classifications: 37M10; 68T07; 15A18. Acknowledgments: The author's deep appreciation goes out to NASA's teams for AIRS/AMSU, MODIS and MOPITT data for tropospheric CO. Disclosure statement: No potential conflict of interest was reported by the author(s).","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"29 8","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134900804","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Phase II control charts for monitoring the depth-ratio of ball-bearings involving three normal variables","authors":"Li Jin, Amitava Mukherjee, Zhi Song, Jiujun Zhang","doi":"10.1080/02664763.2023.2279015","DOIUrl":"https://doi.org/10.1080/02664763.2023.2279015","url":null,"abstract":"This paper investigates the problem of monitoring the ratio involving three variables, jointly distributed as trivariate normal. The Shewhart-type and two exponentially weighted moving average (EWMA) type schemes for monitoring depth ratio are proposed. The ratio of a normal variable to the average of two other normal variables has wide applications in natural science, production, and engineering. It is defined with slightly different terminology in various contexts, such as depth or aspect ratios. In modern bearing manufacturing, the aspect ratio of width to the average of inner and outer diameters can be an essential indicator of product quality and process stability. While there are many helpful existing charts for monitoring the three components separately or jointly when these characteristics follow a normal distribution, the ratio aspect is often ignored. The Shewhart-type schemes' exact and approximated control limits are considered and analyzed. Numerical results based on Monte-Carlo are conducted using the average run length as a metric with different values of in-control ratio and correlation between the three variables. An application based on the parts manufacturing data illustrates the implementation design of the two control charts. The real-life data analysis shows the efficacy of the proposed monitoring schemes in practice. Keywords: Charting schemes; parts manufacturing; ratio involving three variables; phase-II process monitoring; trivariate normal. Acknowledgments: The authors are grateful to the Editor-in-chief, Associate Editor and three reviewers for various constructive comments and suggestions. Disclosure statement: No potential conflict of interest was reported by the author(s). Notes: 1. https://products.emersonbearing.com/viewitems/deep-groove-radial-ball-bearings/6300-series-deep-groove-radial-ball-bearings Funding: This work was supported by the National Natural Science Foundation of China [Grant Nos. 12171328, 12201429]; the Beijing Natural Science Foundation [Grant No. Z210003]; the Liaoning BaiQianWan Talents Program; the Natural Science Foundation of Liaoning Province [Grant Nos. 2020-MS-139, 2023-MS-142]; the Scientific Research Fund of Liaoning Provincial Education Department of China [Grant No. LJC202006]; the Project of Science and Research of Hebei Educational Department of China [Grant No. ZD2022020]; the Doctoral Research Start-up Fund of Liaoning Province [Grant No. 2021-BS-142]; the Research on Humanities and Social Sciences of the Ministry of Education [Grant No. 22YJC910009]; and the Research of economic and social development in Liaoning Province [Grant No. 20231s1ybkt-103].","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135341864","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Smoothing level selection for density estimators based on the moments","authors":"Rosa M. García-Fernández, Federico Palacios-González","doi":"10.1080/02664763.2023.2277125","DOIUrl":"https://doi.org/10.1080/02664763.2023.2277125","url":null,"abstract":"This paper introduces an approach to select the bandwidth or smoothing parameter in multiresolution (MR) density estimation and nonparametric density estimation. It is based on the evolution of the second, third and fourth central moments and the shape of the estimated densities for different bandwidths and resolution levels. The proposed method has been applied to density estimation by means of multiresolution densities as well as kernel density estimation (MRDE and KDE respectively). The results of the simulations and the empirical application demonstrate that the level of resolution resulting from the moments method performs better with multimodal densities than the Bayesian Information Criterion (BIC) for multiresolution densities estimation and the plug-in for kernel densities estimation. Keywords: Multiresolution density estimation; kernel density estimation; bandwidth; moments and level of resolution. Disclosure statement: No potential conflict of interest was reported by the author(s). Notes: 1. The multiresolution densities are a particular case of semiparametric models (see [12, 14]). 2. This is a well-known fact underlying all the bandwidth selection methods. 3. Recall that these intervals form a partition of the real line and their amplitude converges to zero as j increases. 4. Unless this is done parametrically using the EM algorithm on a mixture model of three double exponential distributions, but for a sample of size 10,000 the process time is too long. 5. Note that the values for the Gini coefficient can differ from other publications since our illustration is based on gross income instead of net income. 6. The expected value of the density is zero and the central and non-central moments are equal.","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"312 10","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135474671","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A computationally efficient sequential regression imputation algorithm for multilevel data","authors":"Tugba Akkaya Hocagil, Recai M. Yucel","doi":"10.1080/02664763.2023.2277669","DOIUrl":"https://doi.org/10.1080/02664763.2023.2277669","url":null,"abstract":"Due to the computational burden, especially in high-dimensional settings, sequential imputation may not be practical. In this paper, we adopt computationally advantageous methods by sampling the missing data from their respective predictive distributions, which leads to significantly improved computation time in the class of variable-by-variable imputation algorithms. We assess the computational performance in a comprehensive simulation study. We then compare and contrast the performance of our algorithm with commonly used alternatives. The results show that our method has a significant advantage over the commonly used alternatives with respect to computational efficiency and inferential quality. Finally, we demonstrate our methods in a substantive problem aimed at investigating the effects of area-level behavioral, socioeconomic, and demographic characteristics on poor birth outcomes in New York State among singleton births. Keywords: Sequential regression imputation; multilevel data; computational efficiency; fast variable by variable imputation; multiple imputation by chained equations. Acknowledgments: We thank Dr. Tabassum Insaf for providing assistance in accessing the New York State Vital Records Registry data. Disclosure statement: No potential conflict of interest was reported by the author(s).","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"2017 6","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135635744","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}