{"title":"An Optimised Optional Randomised Response Technique","authors":"Kavya Pushadapu, Sarjinder Singh, Stephen A. Sedory","doi":"10.1111/insr.12581","DOIUrl":"https://doi.org/10.1111/insr.12581","url":null,"abstract":"In this paper, we begin by reviewing the optional randomised response technique estimator (ORRTE) developed by Chaudhuri and Mukerjee for estimating the proportion of a sensitive characteristic in a population. We show that their estimator is unbiased and has smaller variance than Warner's estimator. Then we make an attempt at developing an optimised optional randomised response technique estimator (OORRTE). The proposed OORRTE is shown to be more efficient than the ORRTE. Findings from simulation studies are discussed and interpreted for various situations. Sample sizes for Warner's estimator, the ORRTE and the OORRTE are computed based on power analysis introduced by Ulrich, Schroter, Striegel and Simon. Finally, we include an application to real data on COVID‐19 by considering it to be a partially sensitive variable; that is, sensitive to some but not to others. The data used are included in the paper and the R‐codes used in the simulation study are documented in online material.","PeriodicalId":14479,"journal":{"name":"International Statistical Review","volume":null,"pages":null},"PeriodicalIF":2.0,"publicationDate":"2024-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141188933","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Joint Robust Variable Selection of Mean and Covariance Model via Shrinkage Methods","authors":"Y. Güney, Fulya Gokalp Yavuz, Olcay Arslan","doi":"10.1111/insr.12577","DOIUrl":"https://doi.org/10.1111/insr.12577","url":null,"abstract":"A valuable and robust extension of the traditional joint mean and covariance models when data are subject to outliers and/or heavy‐tailed outcomes can be achieved using the joint modelling of location and scatter matrix of the multivariate t‐distribution. This model encompasses three models in itself, and the number of unknown parameters in the covariance model increases quadratically with the matrix size. As a result, selecting the important variables becomes a crucial aspect to consider. In this context, the variable selection combined with the parameter estimation is considered under the normality assumption. However, because of the non‐robustness of the normal distribution, the resulting estimators will be sensitive to outliers and/or heavy‐tailedness in the data. This paper has two objectives to overcome these problems. The first is to obtain the maximum likelihood estimates of the parameters and propose an expectation‐maximisation type algorithm as an alternative to the Fisher scoring algorithm in the literature. We also consider simultaneous parameter estimation and variable selection in the multivariate t‐joint location and scatter matrix models. The consistency and oracle properties of the regularised estimators are also established. Simulation studies and real data analysis are provided to assess the performance of the proposed methods.","PeriodicalId":14479,"journal":{"name":"International Statistical Review","volume":null,"pages":null},"PeriodicalIF":2.0,"publicationDate":"2024-05-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141099030","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Survey of Monte Carlo Methods for Noisy and Costly Densities With Application to Reinforcement Learning and ABC","authors":"Fernando Llorente, Luca Martino, Jesse Read, David Delgado‐Gómez","doi":"10.1111/insr.12573","DOIUrl":"https://doi.org/10.1111/insr.12573","url":null,"abstract":"This survey gives an overview of Monte Carlo methodologies using surrogate models, for dealing with densities that are intractable, costly, and/or noisy. This type of problem can be found in numerous real‐world scenarios, including stochastic optimisation and reinforcement learning, where each evaluation of a density function may incur some computationally‐expensive or even physical (real‐world activity) cost, likely to give different results each time. The surrogate model does not incur this cost, but there are important trade‐offs and considerations involved in the choice and design of such methodologies. We classify the different methodologies into three main classes and describe specific instances of algorithms under a unified notation. A modular scheme that encompasses the considered methods is also presented. A range of application scenarios is discussed, with special attention to the likelihood‐free setting and reinforcement learning. Several numerical comparisons are also provided.","PeriodicalId":14479,"journal":{"name":"International Statistical Review","volume":null,"pages":null},"PeriodicalIF":2.0,"publicationDate":"2024-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141062556","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ODC and ROC Curves, Comparison Curves and Stochastic Dominance","authors":"Teresa Ledwina, Adam Zagdański","doi":"10.1111/insr.12571","DOIUrl":"10.1111/insr.12571","url":null,"abstract":"We discuss two novel approaches to inter-distributional comparisons in the classical two-sample problem. Our starting point is a properly standardised combination of the ordinal dominance and receiver characteristic curves, denoted by ODC and ROC, respectively, which are very popular in several areas of statistics and data analysis. The proposed new curves are termed the comparison curves. Their estimates, being weighted rank processes on (0,1), form the basis of inference. These weighted processes are intuitive, well-suited for visual inspection of data at hand and are also useful for constructing some formal inferential procedures. They can be applied to several variants of the two-sample problem. Their use can help improve some existing procedures both in terms of power and the ability to identify the sources of departures from the postulated model. To simplify interpretation of finite sample results, we restrict attention to values of the processes on a finite grid of points. This results in the so-called bar plots (B-plots), which readably summarise the information contained in the data. What is more, we show that B-plots along with adjusted simultaneous acceptance regions provide principled information about where the model departs from the data. This leads to a framework that facilitates identification of regions with locally significant differences. We show an application of the considered techniques to a standard stochastic dominance testing problem. Some min-type statistics are introduced and investigated. A simulation study compares two tests pertinent to the comparison curves to well-established tests in the literature and demonstrates the strong and competitive performance of the former in many typical situations. Some real data applications illustrate the simplicity and practical usefulness of the proposed approaches. A range of other applications of the considered weighted processes is briefly discussed too.","PeriodicalId":14479,"journal":{"name":"International Statistical Review","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2024-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140938951","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Raise Regression: Justification, Properties and Application","authors":"Román Salmerón‐Gómez, Catalina B. García‐García, José García‐Pérez","doi":"10.1111/insr.12575","DOIUrl":"https://doi.org/10.1111/insr.12575","url":null,"abstract":"Multicollinearity results in inflation in the variance of the ordinary least squares estimators due to the correlation between two or more independent variables (including the constant term). A widely applied solution is to estimate with penalised estimators such as the ridge estimator, which trade off some bias in the estimators to gain a reduction in the variance of these estimators. Although the variance diminishes with these procedures, all seem to indicate that the inference and goodness of fit are controversial. Alternatively, the raise regression allows mitigation of the problems associated with multicollinearity without the loss of inference or the coefficient of determination. This paper completely formalises the raise estimator. For the first time, the norm of the estimator, the behaviour of the individual and joint significance, the behaviour of the mean squared error and the coefficient of variation are analysed. We also present the generalisation of the estimation and the relation between the raise and the residualisation estimators. To have a better understanding of raise regression, previous contributions are also summarised: its mean squared error, the variance inflation factor, the condition number, adequate selection of the variable to be raised, the successive raising, and the relation between the raise and the ridge estimator. The usefulness of the raise regression as an alternative to mitigate multicollinearity is illustrated with two empirical applications.","PeriodicalId":14479,"journal":{"name":"International Statistical Review","volume":null,"pages":null},"PeriodicalIF":2.0,"publicationDate":"2024-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140837225","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"One-Inflation and Zero-Truncation Count Data Modelling Revisited With a View on Horvitz–Thompson Estimation of Population Size","authors":"Dankmar Böhning, Herwig Friedl","doi":"10.1111/insr.12570","DOIUrl":"10.1111/insr.12570","url":null,"abstract":"Estimating the size of a hard-to-count population is a challenging matter. We consider uni-list approaches in which the count of identifications per unit is the basis of analysis. Unseen units have a zero count and do not occur in the sample, leading to a zero-truncated setting. Because of various mechanisms, one-inflation is a frequently occurring phenomenon that can lead to seriously biased estimates of population size. The current work reviews some recent advances on one-inflation and zero-truncation modelling, and furthermore focuses on the impact it has on population size estimation. The zero-truncated one-inflated and the one-inflated zero-truncated models are compared (also with the model ignoring one-inflation) in terms of Horvitz–Thompson estimation of population size. The simulation work shows clearly the biasing effect of ignoring one-inflation. Both models, the zero-truncated one-inflated and the one-inflated zero-truncated one, are suitable to model ongoing one-inflation. It is also important to choose an appropriate base-line distributional model. Finally, all models derived in the paper are illustrated on a number of case studies.","PeriodicalId":14479,"journal":{"name":"International Statistical Review","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2024-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/insr.12570","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140837218","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Small Sample Inference for Two‐Way Capture‐Recapture Experiments","authors":"Louis‐Paul Rivest, Mamadou Yauck","doi":"10.1111/insr.12574","DOIUrl":"https://doi.org/10.1111/insr.12574","url":null,"abstract":"The properties of the generalised Waring distribution defined on the non‐negative integers are reviewed. Formulas for its moments and its mode are given. A construction as a mixture of negative binomial distributions is also presented. Then we turn to the Petersen model for estimating the population size in a two‐way capture‐recapture experiment. We construct a Bayesian model for the population size by combining a Waring prior with the hypergeometric distribution for the number of units caught twice in the experiment. Credible intervals for the population size are obtained using quantiles of the posterior, a generalised Waring distribution. The standard confidence interval for the population size constructed using the asymptotic variance of the Petersen estimator and the 0.5 logit transformed interval are shown to be special cases of the generalised Waring credible interval. The true coverage of this interval is shown to be bigger than or equal to its nominal coverage in small populations, regardless of the capture probabilities. In addition, its length is substantially smaller than that of the 0.5 logit transformed interval. Thus, the proposed generalised Waring credible interval appears to be the best way to quantify the uncertainty of the Petersen estimator for population size.","PeriodicalId":14479,"journal":{"name":"International Statistical Review","volume":null,"pages":null},"PeriodicalIF":2.0,"publicationDate":"2024-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140798380","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Nonresponse Bias Analysis in Longitudinal Studies: A Comparative Review with an Application to the Early Childhood Longitudinal Study","authors":"Yajuan Si, Roderick J.A. Little, Ya Mo, Nell Sedransk","doi":"10.1111/insr.12566","DOIUrl":"10.1111/insr.12566","url":null,"abstract":"Longitudinal studies are subject to nonresponse when individuals fail to provide data for entire waves or particular questions of the survey. We compare approaches to nonresponse bias analysis (NRBA) in longitudinal studies and illustrate them on the Early Childhood Longitudinal Study, Kindergarten Class of 2010–2011 (ECLS-K:2011). Wave nonresponse with attrition often yields a monotone missingness pattern, and the missingness mechanism can be missing at random (MAR) or missing not at random (MNAR). We discuss weighting, multiple imputation (MI), incomplete data modelling and Bayesian approaches to NRBA for monotone patterns. Weighting adjustments can be effective when the constructed weights are correlated with the survey outcome of interest. MI allows for variables with missing values to be included in the imputation model, yielding potentially less biased and more efficient estimates. We add offsets in the MAR results to provide sensitivity analyses to assess MNAR deviations. We conduct NRBA for descriptive summaries and analytic model estimates in the ECLS-K:2011 application. The strength of evidence about our NRBA depends on the strength of the relationship between the fully observed variables and the key survey outcomes, so the key to a successful NRBA is to include strong predictors.","PeriodicalId":14479,"journal":{"name":"International Statistical Review","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2024-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/insr.12566","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140169849","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multidimensional Stationary Time Series: Dimension Reduction and Prediction. Marianna Bolla, Tamás Szabados. Routledge, 2023, xiv + 318 pages, $59.95, paperback ISBN: 9780367619701","authors":"Brian W. Sloboda","doi":"10.1111/insr.12567","DOIUrl":"10.1111/insr.12567","url":null,"abstract":"","PeriodicalId":14479,"journal":{"name":"International Statistical Review","volume":null,"pages":null},"PeriodicalIF":2.0,"publicationDate":"2024-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140115075","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Slicing-Free Perspective to Sufficient Dimension Reduction: Selective Review and Recent Developments","authors":"Lu Li, Xiaofeng Shao, Zhou Yu","doi":"10.1111/insr.12565","DOIUrl":"10.1111/insr.12565","url":null,"abstract":"Since the pioneering work of sliced inverse regression, sufficient dimension reduction has been growing into a mature field in statistics and it has broad applications to regression diagnostics, data visualisation, image processing and machine learning. In this paper, we provide a review of several popular inverse regression methods, including the sliced inverse regression (SIR) method and the principal Hessian directions (PHD) method. In addition, we adopt a conditional characteristic function approach and develop a new class of slicing-free methods, which are parallel to the classical SIR and PHD, and are named weighted inverse regression ensemble (WIRE) and weighted PHD (WPHD), respectively. The relationship with the recently developed martingale difference divergence matrix is also revealed. Numerical studies and a real data example show that the proposed slicing-free alternatives outperform SIR and PHD.","PeriodicalId":14479,"journal":{"name":"International Statistical Review","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2024-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140073149","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}