{"title":"An exact projection pursuit-based algorithm for multivariate two-sample nonparametric testing applicable to retrospective and group sequential studies","authors":"Li Zou, Gregory Gurevich, Albert Vexler","doi":"10.1080/02664763.2023.2277118","DOIUrl":"https://doi.org/10.1080/02664763.2023.2277118","url":null,"abstract":"Nonparametric tests for equality of multivariate distributions are frequently desired in research. It is commonly required that test procedures based on relatively small samples of vectors accurately control the corresponding Type I Error (TIE) rates. Often, in multivariate testing, extensions of null-distribution-free univariate methods, e.g., Kolmogorov-Smirnov and Cramér-von Mises type schemes, are not exact, since their null distributions depend on underlying data distributions. The present paper extends the density-based empirical likelihood technique in order to nonparametrically approximate the most powerful test for the multivariate two-sample (MTS) problem, yielding an exact finite-sample test statistic. We rigorously apply a one-to-one mapping between the equality of vectors' distributions and the equality of distributions of relevant univariate linear projections. We establish a general algorithm that simplifies the use of projection pursuit, employing only a few of the infinitely many linear combinations of observed vectors' components. The displayed distribution-free strategy is employed in retrospective and group sequential manners. A novel MTS nonparametric procedure in the group sequential manner is proposed. The asymptotic consistency of the proposed technique is shown. Monte Carlo studies demonstrate that the proposed procedures exhibit extremely high and stable power characteristics across a variety of settings. 
Supplementary materials for this article are available online. KEYWORDS: Density-based empirical likelihood; exact test; multivariate two-sample test; nonparametric test; projection pursuit. Acknowledgement: We are grateful to the Editor, the AE and two reviewers for helpful comments. Disclosure statement: No potential conflict of interest was reported by the author(s).","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"2007 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135636034","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust estimation and bias-corrected empirical likelihood in generalized linear models with right censored data","authors":"Liugen Xue, Junshan Xie, Xiaohui Yang","doi":"10.1080/02664763.2023.2277117","DOIUrl":"https://doi.org/10.1080/02664763.2023.2277117","url":null,"abstract":"In this paper, we study robust estimation and empirical likelihood for the regression parameter in generalized linear models with right censored data. A robust estimating equation is proposed to estimate the regression parameter, and the resulting estimator is consistent and asymptotically normal. A bias-corrected empirical log-likelihood ratio statistic of the regression parameter is constructed, and it is shown that the statistic converges weakly to a standard χ² distribution. The result can be directly used to construct a confidence region for the regression parameter. We use the bias correction method to directly calibrate the empirical log-likelihood ratio, which does not need to be multiplied by an adjustment factor. We also propose a method for selecting the tuning parameters in the loss function. Simulation studies show that the estimator of the regression parameter is robust and the bias-corrected empirical likelihood is better than the normal approximation method. An example of a real dataset from Alzheimer's disease studies shows that the proposed method can be applied in practical problems. Keywords: Generalized linear model; right censored data; robust estimation; empirical likelihood; regression parameter. Acknowledgments: The authors thank the Editor, Associate Editor and two referees for their helpful comments. The dataset used was provided by Dr. Chunling Liu of the Hong Kong Polytechnic University. 
The source of this dataset is available on https://adni.loni.usc.edu/about/. Disclosure statement: No potential conflict of interest was reported by the author(s). Additional information. Funding: The research was supported by the National Natural Science Foundation of China (11971001), the Natural Science Foundation of Henan (222300420417), and the Science and Technology Project (2103004).","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"47 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135819996","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Estimation of time-varying kernel densities and chronology of the impact of COVID-19 on financial markets","authors":"Matthieu Garcin, Jules Klein, Sana Laaribi","doi":"10.1080/02664763.2023.2272226","DOIUrl":"https://doi.org/10.1080/02664763.2023.2272226","url":null,"abstract":"The time-varying kernel density estimation relies on two free parameters: the bandwidth and the discount factor. We propose to select these parameters so as to minimize a criterion consistent with the traditional requirements of the validation of a probability density forecast. These requirements are both the uniformity and the independence of the so-called probability integral transforms, which are the forecast time-varying cumulative distributions applied to the observations. We thus build a new numerical criterion incorporating both the uniformity and independence properties by means of an adapted Kolmogorov-Smirnov statistic. We apply this method to financial markets during the COVID-19 crisis. We determine the time-varying density of daily price returns of several stock indices and, using various divergence statistics, we are able to describe the chronology of the crisis as well as regional disparities. For instance, we observe a more limited impact of COVID-19 on financial markets in China, a strong impact in the US, and a slow recovery in Europe.","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"326 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136018115","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A longitudinal study of the influence of air pollutants on children: a robust multivariate approach","authors":"Ian Meneghel Danilevicz, Pascal Bondon, Valdério Anselmo Reisen, Faradiba Sarquis Serpa","doi":"10.1080/02664763.2023.2272228","DOIUrl":"https://doi.org/10.1080/02664763.2023.2272228","url":null,"abstract":"","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"6 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136104508","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving the Hosmer-Lemeshow goodness-of-fit test in large models with replicated Bernoulli trials","authors":"Nikola Surjanovic, Thomas M. Loughin","doi":"10.1080/02664763.2023.2272223","DOIUrl":"https://doi.org/10.1080/02664763.2023.2272223","url":null,"abstract":"The Hosmer-Lemeshow (HL) test is a commonly used global goodness-of-fit (GOF) test that assesses the quality of the overall fit of a logistic regression model. In this paper, we give results from simulations showing that the type I error rate (and hence power) of the HL test decreases as model complexity grows, provided that the sample size remains fixed and binary replicates (multiple Bernoulli trials) are present in the data. We demonstrate that a generalized version of the HL test (GHL) presented in previous work can offer some protection against this power loss. These results are also supported by application of both the HL and GHL test to a real-life data set. We conclude with a brief discussion explaining the behavior of the HL test, along with some guidance on how to choose between the two tests. In particular, we suggest the GHL test to be used when there are binary replicates or clusters in the covariate space, provided that the sample size is sufficiently large.","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"43 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136261853","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Editorial to the special issue: modern streaming data analytics.","authors":"Yajun Mei, Jay Bartroff, Jie Chen, Georgios Fellouris, Ruizhi Zhang","doi":"10.1080/02664763.2023.2247646","DOIUrl":"10.1080/02664763.2023.2247646","url":null,"abstract":"","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"50 14","pages":"2857-2861"},"PeriodicalIF":1.2,"publicationDate":"2023-10-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10557558/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41101725","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The zero-and-plus/minus-one inflated extended-Poisson distribution","authors":"Maher Kachour, Christophe Chesneau","doi":"10.1080/02664763.2023.2260570","DOIUrl":"https://doi.org/10.1080/02664763.2023.2260570","url":null,"abstract":"In this paper, we introduce a new distribution defined on Z, called the ZPMOIEP distribution, which can be viewed as a natural extension of the zero-and-one-inflated Poisson (ZOIP) distribution. It is designed to fit the count data with potentially excess zeros and/or ones, and/or minus ones. We explore its various properties and investigate the estimation of the unknown parameters. Moreover, simulation experiments are carried out to attest to the performance of the estimation. Through the use of a useful data set on football scores, the applicability of the proposed distribution is examined. KEYWORDS: Zero-and-one-inflated Poisson distribution; discrete distribution defined on Z; extended Poisson distribution; simulation; count data analysis. Disclosure statement: No potential conflict of interest was reported by the author(s).","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135579230","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian transformation model for spatial partly interval-censored data","authors":"Mingyue Qiu, Tao Hu","doi":"10.1080/02664763.2023.2263819","DOIUrl":"https://doi.org/10.1080/02664763.2023.2263819","url":null,"abstract":"The transformation model with partly interval-censored data offers a highly flexible modeling framework that can simultaneously support multiple common survival models and a wide variety of censored data types. However, real data may contain heterogeneity that cannot be entirely explained by covariates and may be brought on by a variety of unmeasured regional characteristics. Due to this, we introduce the conditionally autoregressive prior into the transformation model with partly interval-censored data and take the spatial frailty into account. An efficient Markov chain Monte Carlo method is proposed to handle the posterior sampling and model inference. The approach is simple to use and does not include any challenging Metropolis steps owing to four-stage data augmentation. 
Through several simulations, the suggested method's empirical performance is assessed and then the method is used in a leukemia study. Keywords: Data augmentation; MCMC method; partly interval-censored data; spatial effect; semiparametric transformation model. Acknowledgments: The authors wish to thank the Editor, the Associate Editor and two reviewers for their many helpful and insightful comments and suggestions. Disclosure statement: No potential conflict of interest was reported by the author(s). Additional information. Funding: This research was partially supported by the Beijing Natural Science Foundation [grant number Z210003] and the National Natural Science Foundation of China [grant numbers 12171328 and 11971064].","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135580017","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Estimating linear mixed effect models with non-normal random effects through saddlepoint approximation and its application in retail pricing analytics","authors":"Hao Chen, Lanshan Han, Alvin Lim","doi":"10.1080/02664763.2023.2260576","DOIUrl":"https://doi.org/10.1080/02664763.2023.2260576","url":null,"abstract":"Linear Mixed Effects (LME) models are powerful statistical tools that have been employed in many different real-world applications such as retail data analytics, marketing measurement, and medical research. Statistical inference is often conducted via maximum likelihood estimation with Normality assumptions on the random effects. Nevertheless, for many applications in the retail industry, it is often necessary to consider non-Normal distributions on the random effects when considering the unknown parameters' business interpretations. Motivated by this need, a linear mixed effects model with possibly non-Normal distribution is studied in this research. We propose a general estimating framework based on a saddlepoint approximation (SA) of the probability density function of the dependent variable, which leads to constrained nonlinear optimization problems. The classical LME model with Normality assumption can then be viewed as a special case under the proposed general SA framework. 
Compared with the existing approach, the proposed method enhances the real-world interpretability of the estimates with satisfactory model fits. KEYWORDS: Mixed effects model; linear regression; constrained optimization; statistical inference; saddlepoint approximation. MATHEMATICAL SUBJECT CLASSIFICATION: 62J05. Disclosure statement: No potential conflict of interest was reported by the author(s).","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135925496","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Estimating the prevalence of osteoporosis using ranked-based methodologies and Manitoba's population-based BMD registry","authors":"Sedigheh Omidvar, Mohammad Jafari Jozani, Nader Nematollahi, William D. Leslie","doi":"10.1080/02664763.2023.2260572","DOIUrl":"https://doi.org/10.1080/02664763.2023.2260572","url":null,"abstract":"Osteoporosis is a metabolic bone disorder that is characterized by reduced bone mineral density (BMD) and deterioration of bone microarchitecture. Osteoporosis is highly prevalent among women over 50, leading to skeletal fragility and risk of fracture. Early diagnosis and treatment of those at high risk for fracture is very important in order to avoid morbidity, mortality and economic burden from preventable fractures. The province of Manitoba established a BMD testing program in 1997. The Manitoba BMD registry is now the largest population-based BMD registry in the world, and has detailed information on fracture outcomes and other covariates for over 160,000 BMD assessments. In this paper, we develop a number of methodologies based on ranked-set type sampling designs to estimate the prevalence of osteoporosis among women of age 50 and older in the province of Manitoba. We use a parametric approach based on finite mixture models, as well as the usual approaches using simple random and stratified sampling designs. Results are obtained under perfect and imperfect ranking scenarios while the sampling and ranking costs are incorporated into the study. 
We observe that rank-based methodologies can be used as cost-efficient methods to monitor the prevalence of osteoporosis. Keywords: Bone mineral density; EM algorithm; finite mixture model; osteoporosis; stratified sampling; unbalanced ranked set sampling. Disclosure statement: No potential conflict of interest was reported by the author(s). Additional information. Funding: Mohammad Jafari Jozani gratefully acknowledges the research support of the Natural Sciences and Engineering Research Council of Canada (NSERC). We express our gratitude to two anonymous reviewers and an associate editor for their valuable and constructive comments.","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135925953","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}