{"title":"Equivalence testing for multiple groups","authors":"Tony Pourmohamad, Herbert K. H. Lee","doi":"10.1002/sta4.645","DOIUrl":"https://doi.org/10.1002/sta4.645","url":null,"abstract":"Testing for equivalence, rather than testing for a difference, is an important component of some scientific studies. While the focus of the existing literature is on comparing two groups for equivalence, real-world applications arise regularly that require testing across more than two groups. This paper reviews the existing approaches for testing across multiple groups and proposes a novel framework for multigroup equivalence testing under a Bayesian paradigm. This approach allows for a more scientifically meaningful definition of the equivalence margin and a more powerful test than the few existing alternatives. This approach also allows a new definition of equivalence based on future differences.","PeriodicalId":56159,"journal":{"name":"Stat","volume":"3 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139460065","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Iterative estimating equations for disease mapping with spatial zero‐inflated Poisson data","authors":"Pei-Sheng Lin, Jun Zhu, Feng‐Chang Lin","doi":"10.1002/sta4.646","DOIUrl":"https://doi.org/10.1002/sta4.646","url":null,"abstract":"Spatial epidemiology often involves the analysis of spatial count data with an unusually high proportion of zero observations. While Bayesian hierarchical models perform very well for zero‐inflated data in many situations, a smooth response surface is usually required for the Bayesian methods to converge. However, for infectious disease data with excessive zeros, a Wombling issue with large spatial variation could make the Bayesian methods infeasible. To address this issue, we develop estimating equations associated with disease mapping by including over‐dispersion and spatial noises in a spatial zero‐inflated Poisson model. Asymptotic properties are derived for the parameter estimates. Simulations and data analysis are used to assess and illustrate the proposed method.","PeriodicalId":56159,"journal":{"name":"Stat","volume":"91 25","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139454718","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Significance of modes in the torus by topological data analysis","authors":"Changjo Yu, Sungkyu Jung, Jisu Kim","doi":"10.1002/sta4.636","DOIUrl":"https://doi.org/10.1002/sta4.636","url":null,"abstract":"This paper addresses the problem of identifying modes or density bumps in multivariate angular or circular data, which have diverse applications in fields like medicine, biology and physics. We focus on the use of topological data analysis and persistent homology for this task. Specifically, we extend the methods for uncertainty quantification in the context of a torus sample space, where circular data lie. To achieve this, we employ two types of density estimators, namely, the von Mises kernel density estimator and the von Mises mixture model, to compute persistent homology, and propose a scale-space view for searching significant bumps in the density. The results of bump hunting are summarised and visualised through a scale-space diagram. Our approach using the mixture model for persistent homology offers advantages over conventional methods, allowing for dendrogram visualisation of components and identification of mode locations. For testing whether a detected mode is really there, we propose several inference tools based on bootstrap resampling and concentration inequalities, establishing their theoretical applicability. Experimental results on SARS-CoV-2 spike glycoprotein torsion angle data demonstrate the effectiveness of our proposed methods in practice.","PeriodicalId":56159,"journal":{"name":"Stat","volume":"20 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2023-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138717531","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An asymptotically efficient closed-form estimator for the Dirichlet distribution","authors":"Jae Ho Chang, Sang Kyu Lee, Hyoung-Moon Kim","doi":"10.1002/sta4.640","DOIUrl":"https://doi.org/10.1002/sta4.640","url":null,"abstract":"Maximum likelihood estimator (MLE) of the Dirichlet distribution is usually obtained by using the Newton–Raphson algorithm. However, in some cases, the computational costs can be burdensome, for example, in real-time processes. Therefore, it is beneficial to develop a closed-form estimator that is as efficient as the MLE for large sample. Here, we suggest asymptotically efficient closed-form estimator based on the classical large sample theory.","PeriodicalId":56159,"journal":{"name":"Stat","volume":"82 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2023-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138629302","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust nonparametric estimation of average treatment effects: A propensity score-based varying coefficient approach","authors":"Zhaoqing Tian, Peng Wu, Zixin Yang, Dingjiao Cai, Qirui Hu","doi":"10.1002/sta4.637","DOIUrl":"https://doi.org/10.1002/sta4.637","url":null,"abstract":"We present a novel nonparametric approach for estimating average treatment effects (ATEs), addressing a fundamental challenge in causal inference research, both in theory and empirical studies. Our method offers an effective solution to mitigate the instability problem caused by propensity scores close to zero or one, which are commonly encountered in (augmented) inverse probability weighting approaches. Notably, our method is straightforward to implement and does not depend on outcome model specification. We introduce an estimator for ATE and establish its consistency and asymptotic normality through rigorous analysis. To demonstrate the robustness of our method against extreme propensity scores, we conduct an extensive simulation study. Additionally, we apply our proposed methods to estimate the impact of social activity disengagement on cognitive ability using a nationally representative cohort study. Furthermore, we extend our proposed method to estimate the ATE on the treated population.","PeriodicalId":56159,"journal":{"name":"Stat","volume":"286 1 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2023-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138629207","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Observation-driven exponential smoothing","authors":"Dimitris Karlis, Xanthi Pedeli, Cristiano Varin","doi":"10.1002/sta4.642","DOIUrl":"https://doi.org/10.1002/sta4.642","url":null,"abstract":"This article presents an approach to forecasting count time series with a form of exponential smoothing built from observation-driven models. The proposed method is easy to implement and simple to interpret. A variant of the approach is also proposed to handle the impact of outliers on the forecast. The performance of the methodology is studied with simulations and illustrated with an analysis of the number of monthly cases of dengue fever observed in Italy for the years 2008–2021. An R package is made available to enable the reader to reproduce the results discussed in the article.","PeriodicalId":56159,"journal":{"name":"Stat","volume":"1 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2023-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138548194","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A comparative analysis of contractual risks in statistical consulting","authors":"David Shilane, Nicole L. Lorenzetti, David K. Kruetter","doi":"10.1002/sta4.639","DOIUrl":"https://doi.org/10.1002/sta4.639","url":null,"abstract":"This study enumerates and compares the risks and rewards of different forms of statistical consulting contracts. We assess three different contract models: project-based fees, hourly fees, and retainer agreements and three different planned durations: project-based, time-based, and evergreen contracts. The requirements of time and effort vary considerably for many aspects of consulting work. The risks of statistical consulting contracts include both the general risks of consulting projects along with the specialized risks of statistical investigations. We enumerate a number of general risks in the categories of unanticipated developments, revisions and collaboration, and changing scopes of projects. Meanwhile, the specialized statistical risks include issues of study design, data quality, statistical investigation, and communication of statistical issues. Because of these concerns, the specialized risks of statistical investigations add considerably to the general risks of consulting projects. Moreover, these issues can be exacerbated or mitigated by the form of the consulting agreement. With a greater understanding of the risks and benefits of each type of contract, statistical consultants and clients can negotiate more mutually beneficial contracts for either or both parties. Through this discussion, we hope to raise awareness of these issues and help to create working conditions with a greater likelihood of a successful project for both statistical consultants and their clients.","PeriodicalId":56159,"journal":{"name":"Stat","volume":"16 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2023-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138548108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Estimation of the ROC curve and the area under it with complex survey data","authors":"Amaia Iparragirre, Irantzu Barrio, Inmaculada Arostegui","doi":"10.1002/sta4.635","DOIUrl":"https://doi.org/10.1002/sta4.635","url":null,"abstract":"Logistic regression models are widely applied in daily practice. Hence, it is necessary to ensure they have an adequate predictive performance, which is usually estimated by means of the receiver operating characteristic (ROC) curve and the area under it (area under the curve [AUC]). Traditional estimators of these parameters are thought to be applied to simple random samples but are not appropriate for complex survey data. The goal of this work is to propose new weighted estimators for the ROC curve and AUC based on sampling weights which, in the context of complex survey data, indicate the number of units that each sampled observation represents in the population. The behaviour of the proposed estimators is evaluated and compared with the traditional unweighted ones by means of a simulation study. Finally, weighted and unweighted ROC curve and AUC estimators are applied to real survey data in order to compare the estimates in a real scenario. The results suggest the use of the weighted estimators proposed in this work in order to obtain unbiassed estimates for the ROC curve and AUC of logistic regression models fitted to complex survey data.","PeriodicalId":56159,"journal":{"name":"Stat","volume":"24 8","pages":""},"PeriodicalIF":1.7,"publicationDate":"2023-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138526365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Non-degenerate U-statistics for data missing completely at random with application to testing independence","authors":"Danijel Aleksić, Marija Cuparić, Bojana Milošević","doi":"10.1002/sta4.634","DOIUrl":"https://doi.org/10.1002/sta4.634","url":null,"abstract":"Although the era of digitalization has enabled access to large quantities of data, due to their insufficient structuring, some data are often missing, and sometimes, the percentage of missing data is significant compared to the entire sample. On the other hand, most of the statistical methodology is designed for complete data. Here, we explore the asymptotic properties of non-degenerate U-statistics when the data are missing completely at random and a complete-case approach is utilized. The obtained results are applied to the estimator of Kendall's τ used for testing independence. In this context, the median-based imputation approach is also considered, and asymptotic properties are explored. In addition, both complete-case and median imputation approaches are compared in an extensive Monte Carlo study.","PeriodicalId":56159,"journal":{"name":"Stat","volume":"368 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2023-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138526371","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A latent space accumulator model for response time: Applications to cognitive assessment data","authors":"Ick Hoon Jin, Jonghyun Yun, Hyunjoo Kim, Minjeong Jeon","doi":"10.1002/sta4.632","DOIUrl":"https://doi.org/10.1002/sta4.632","url":null,"abstract":"Response time has attracted increased interest in educational and psychological assessment for, for example, measuring test takers' processing speed, improving the measurement accuracy of ability and understanding aberrant response behaviour. Most models for response time analysis are based on a parametric assumption about the response time distribution. The Cox proportional hazard model has been utilized for response time analysis for the advantages of not requiring a distributional assumption of response time and enabling meaningful interpretations with respect to response processes. In this paper, we present a new version of the proportional hazard model, called a latent space accumulator model, for cognitive assessment data based on accumulators for two competing response outcomes, such as correct versus incorrect responses. The proposed model extends a previous accumulator model by capturing dependencies between respondents and test items across accumulators in the form of distances in a two-dimensional Euclidean space. A fully Bayesian approach is developed to estimate the proposed model. The utilities of the proposed model are illustrated with two real data examples.","PeriodicalId":56159,"journal":{"name":"Stat","volume":"10 8","pages":""},"PeriodicalIF":1.7,"publicationDate":"2023-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138526372","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}