{"title":"Statistical inference for stationary linear models with tapered data","authors":"M. Ginovyan, A. A. Sahakyan","doi":"10.1214/21-ss134","DOIUrl":"https://doi.org/10.1214/21-ss134","url":null,"abstract":"In this paper, we survey some recent results on statistical inference (parametric and nonparametric statistical estimation, hypotheses testing) about the spectrum of stationary models with tapered data. We also discuss some questions concerning tapered Toeplitz matrices and operators, central limit theorems for tapered Toeplitz type quadratic functionals, and tapered Fejér-type kernels and singular integrals. These are the main tools for obtaining the corresponding results, and also are of interest in themselves. The processes considered will be discrete-time and continuous-time Gaussian, linear or Lévy-driven linear processes with memory.","PeriodicalId":46627,"journal":{"name":"Statistics Surveys","volume":"25 1","pages":""},"PeriodicalIF":3.3,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78152907","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Flexible, boundary adapted, nonparametric methods for the estimation of univariate piecewise-smooth functions","authors":"U. Amato, A. Antoniadis, I. Feis","doi":"10.1214/20-ss128","DOIUrl":"https://doi.org/10.1214/20-ss128","url":null,"abstract":"We present and compare some nonparametric estimation methods (wavelet and/or spline-based) designed to recover a one-dimensional piecewise-smooth regression function in both a fixed equidistant or not equidistant design regression model and a random design model. Wavelet methods are known to be very competitive in terms of denoising and compression, due to the simultaneous localization property of a function in time and frequency. However, boundary assumptions, such as periodicity or symmetry, generate bias and artificial wiggles which degrade overall accuracy. Simple methods have been proposed in the literature for reducing the bias at the boundaries. We introduce new ones based on adaptive combinations of two estimators. The underlying idea is to combine a highly accurate method for non-regular functions, e.g., wavelets, with one well behaved at boundaries, e.g., Splines or Local Polynomial. We provide some asymptotic optimal results supporting our approach. All the methods can handle data with a random design. We also sketch some generalization to the multidimensional setting. To study the performance of the proposed approaches we have conducted an extensive set of simulations on synthetic data. An interesting regression analysis of two real data applications using these procedures unambiguously demonstrates their effectiveness.","PeriodicalId":46627,"journal":{"name":"Statistics Surveys","volume":"25 6 1","pages":""},"PeriodicalIF":3.3,"publicationDate":"2020-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79741865","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
N. Hirschauer, Sven Gruener, O. Musshoff, C. Becker, Antje Jantsch
{"title":"Can $p$-values be meaningfully interpreted without random sampling?","authors":"N. Hirschauer, Sven Gruener, O. Musshoff, C. Becker, Antje Jantsch","doi":"10.31235/osf.io/yazr8","DOIUrl":"https://doi.org/10.31235/osf.io/yazr8","url":null,"abstract":"Besides the inferential errors that abound in the interpretation of p-values, the probabilistic pre-conditions (i.e. random sampling or equivalent) for using them at all are not often met by observational studies in the social sciences. This paper systematizes different sampling designs and discusses the restrictive requirements of data collection that are the sine-qua-non for using p-values.","PeriodicalId":46627,"journal":{"name":"Statistics Surveys","volume":"46 1","pages":""},"PeriodicalIF":3.3,"publicationDate":"2019-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88393893","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Additive monotone regression in high and lower dimensions","authors":"S. Engebretsen, I. Glad","doi":"10.1214/19-SS124","DOIUrl":"https://doi.org/10.1214/19-SS124","url":null,"abstract":"In numerous problems where the aim is to estimate the effect of a predictor variable on a response, one can assume a monotone relationship. For example, dose-effect models in medicine are of this type. In a multiple regression setting, additive monotone regression models assume that each predictor has a monotone effect on the response. In this paper, we present an overview and comparison of very recent frequentist methods for fitting additive monotone regression models. Three of the methods we present can be used both in the high dimensional setting, where the number of parameters p exceeds the number of observations n, and in the classical multiple setting where 1 < p ≤ n. However, many of the most recent methods only apply to the classical setting. The methods are compared through simulation experiments in terms of efficiency, prediction error and variable selection properties in both settings, and they are applied to the Boston housing data. We conclude with some recommendations on when the various methods perform best. MSC 2010 subject classifications: Primary 62G08.","PeriodicalId":46627,"journal":{"name":"Statistics Surveys","volume":"1 1","pages":""},"PeriodicalIF":3.3,"publicationDate":"2019-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72516876","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Statistics Surveys. Pub Date: 2019-01-01. Epub Date: 2019-11-06. DOI: 10.1214/19-SS126
John J Dziak, Donna L Coffman, Matthew Reimherr, Justin Petrovich, Runze Li, Saul Shiffman, Mariya P Shiyko
{"title":"Scalar-on-function regression for predicting distal outcomes from intensively gathered longitudinal data: Interpretability for applied scientists.","authors":"John J Dziak, Donna L Coffman, Matthew Reimherr, Justin Petrovich, Runze Li, Saul Shiffman, Mariya P Shiyko","doi":"10.1214/19-SS126","DOIUrl":"10.1214/19-SS126","url":null,"abstract":"Researchers are sometimes interested in predicting a distal or external outcome (such as smoking cessation at follow-up) from the trajectory of an intensively recorded longitudinal variable (such as urge to smoke). This can be done in a semiparametric way via scalar-on-function regression. However, the resulting fitted coefficient regression function requires special care for correct interpretation, as it represents the joint relationship of time points to the outcome, rather than a marginal or cross-sectional relationship. We provide practical guidelines, based on experience with scientific applications, for helping practitioners interpret their results and illustrate these ideas using data from a smoking cessation study.","PeriodicalId":46627,"journal":{"name":"Statistics Surveys","volume":"13 ","pages":"150-180"},"PeriodicalIF":11.0,"publicationDate":"2019-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6863606/pdf/nihms-1058328.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49683490","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PLS for Big Data: A unified parallel algorithm for regularised group PLS","authors":"P. L. D. Micheaux, B. Liquet, Matthew Sutton","doi":"10.1214/19-ss125","DOIUrl":"https://doi.org/10.1214/19-ss125","url":null,"abstract":"Partial Least Squares (PLS) methods have been heavily exploited to analyse the association between two blocks of data. These powerful approaches can be applied to data sets where the number of variables is greater than the number of observations and in the presence of high collinearity between variables. Different sparse versions of PLS have been developed to integrate multiple data sets while simultaneously selecting the contributing variables. Sparse modeling is a key factor in obtaining better estimators and identifying associations between multiple data sets. The cornerstone of the sparse PLS methods is the link between the singular value decomposition (SVD) of a matrix (constructed from deflated versions of the original data) and least squares minimization in linear regression. We review four popular PLS methods for two blocks of data. A unified algorithm is proposed to perform all four types of PLS including their regularised versions. We present various approaches to decrease the computation time and show how the whole procedure can be scalable to big data sets. The bigsgPLS R package implements our unified algorithm and is available at https://github.com/matt-sutton/bigsgPLS. MSC 2010 subject classifications: Primary 6202, 62J99.","PeriodicalId":46627,"journal":{"name":"Statistics Surveys","volume":"148 1","pages":""},"PeriodicalIF":3.3,"publicationDate":"2019-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75176757","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Halfspace depth and floating body","authors":"Stanislav Nagy, C. Schuett, E. Werner","doi":"10.1214/19-SS123","DOIUrl":"https://doi.org/10.1214/19-SS123","url":null,"abstract":"Little known relations of the renowned concept of the halfspace depth for multivariate data with notions from convex and affine geometry are discussed. Halfspace depth may be regarded as a measure of symmetry for random vectors. As such, the depth stands as a generalization of a measure of symmetry for convex sets, well studied in geometry. Under a mild assumption, the upper level sets of the halfspace depth coincide with the convex floating bodies used in the definition of the affine surface area for convex bodies in Euclidean spaces. These connections enable us to partially resolve some persistent open problems regarding theoretical properties of the depth.","PeriodicalId":46627,"journal":{"name":"Statistics Surveys","volume":"51 1","pages":""},"PeriodicalIF":3.3,"publicationDate":"2018-09-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74013232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A design-sensitive approach to fitting regression models with complex survey data","authors":"P. Kott","doi":"10.1214/17-SS118","DOIUrl":"https://doi.org/10.1214/17-SS118","url":null,"abstract":"Fitting complex survey data to regression equations is explored under a design-sensitive model-based framework. A robust version of the standard model assumes that the expected value of the difference between the dependent variable and its model-based prediction is zero no matter what the values of the explanatory variables. The extended model assumes only that the difference is uncorrelated with the covariates. Little is assumed about the error structure of this difference under either model other than independence across primary sampling units. The standard model often fails in practice, but the extended model very rarely does. Under this framework some of the methods developed in the conventional design-based, pseudo-maximum-likelihood framework, such as fitting weighted estimating equations and sandwich mean-squared-error estimation, are retained but their interpretations change. Few of the ideas here are new to the refereed literature. The goal instead is to collect those ideas and put them into a unified conceptual framework.","PeriodicalId":46627,"journal":{"name":"Statistics Surveys","volume":"41 1","pages":"1-17"},"PeriodicalIF":3.3,"publicationDate":"2018-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75137666","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Statistics Surveys. Pub Date: 2018-01-01. Epub Date: 2018-09-03. DOI: 10.1214/18-SS121
Bomin Kim, Kevin H Lee, Lingzhou Xue, Xiaoyue Niu
{"title":"A review of dynamic network models with latent variables.","authors":"Bomin Kim, Kevin H Lee, Lingzhou Xue, Xiaoyue Niu","doi":"10.1214/18-SS121","DOIUrl":"https://doi.org/10.1214/18-SS121","url":null,"abstract":"We present a selective review of statistical modeling of dynamic networks. We focus on models with latent variables, specifically, the latent space models and the latent class models (or stochastic blockmodels), which investigate both the observed features and the unobserved structure of networks. We begin with an overview of the static models, and then we introduce the dynamic extensions. For each dynamic model, we also discuss its applications that have been studied in the literature, with the data source listed in Appendix. Based on the review, we summarize a list of open problems and challenges in dynamic network modeling with latent variables.","PeriodicalId":46627,"journal":{"name":"Statistics Surveys","volume":"12 ","pages":"105-135"},"PeriodicalIF":3.3,"publicationDate":"2018-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1214/18-SS121","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41215732","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
N. Hirschauer, Sven Grüner, O. Musshoff, C. Becker
{"title":"Pitfalls of significance testing and $p$-value variability: An econometrics perspective","authors":"N. Hirschauer, Sven Grüner, O. Musshoff, C. Becker","doi":"10.1214/18-SS122","DOIUrl":"https://doi.org/10.1214/18-SS122","url":null,"abstract":"Data on how many scientific findings are reproducible are generally bleak and a wealth of papers have warned against misuses of the p-value and resulting false findings in recent years. This paper discusses the question of what we can(not) learn from the p-value, which is still widely considered as the gold standard of statistical validity. We aim to provide a non-technical and easily accessible resource for statistical practitioners who wish to spot and avoid misinterpretations and misuses of statistical significance tests. For this purpose, we first classify and describe the most widely discussed (“classical”) pitfalls of significance testing, and review published work on these misuses with a focus on regression-based “confirmatory” study. This includes a description of the single-study bias and a simulation-based illustration of how proper meta-analysis compares to misleading significance counts (“vote counting”). Going beyond the classical pitfalls, we also use simulation to provide intuition that relying on the statistical estimate “p-value” as a measure of evidence without considering its sample-to-sample variability falls short of the mark even within an otherwise appropriate interpretation. We conclude with a discussion of the","PeriodicalId":46627,"journal":{"name":"Statistics Surveys","volume":"44 1","pages":"136-172"},"PeriodicalIF":3.3,"publicationDate":"2018-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90925204","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}