{"title":"Stein estimation in a multivariate setting","authors":"Adrian Fischer, Robert E. Gaunt, Yvik Swan","doi":"arxiv-2312.09344","DOIUrl":"https://doi.org/arxiv-2312.09344","url":null,"abstract":"We use Stein characterisations to derive new moment-type estimators for the parameters of several multivariate distributions in the i.i.d. case; we also derive the asymptotic properties of these estimators. Our examples include the multivariate truncated normal distribution and several spherical distributions. The estimators are explicit and therefore provide an interesting alternative to the maximum-likelihood estimator. The quality of these estimators is assessed through competitive simulation studies in which we compare their behaviour to the performance of other estimators available in the literature.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"27 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138716461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parameter Inference for Hypo-Elliptic Diffusions under a Weak Design Condition","authors":"Yuga Iguchi, Alexandros Beskos","doi":"arxiv-2312.04444","DOIUrl":"https://doi.org/arxiv-2312.04444","url":null,"abstract":"We address the problem of parameter estimation for degenerate diffusion processes defined via the solution of Stochastic Differential Equations (SDEs) with diffusion matrix that is not full-rank. For this class of hypo-elliptic diffusions recent works have proposed contrast estimators that are asymptotically normal, provided that the step-size in-between observations $\\Delta=\\Delta_n$ and their total number $n$ satisfy $n \\to \\infty$, $n \\Delta_n \\to \\infty$, $\\Delta_n \\to 0$, and additionally $\\Delta_n = o(n^{-1/2})$. This latter restriction places a requirement for a so-called `rapidly increasing experimental design'. In this paper, we overcome this limitation and develop a general contrast estimator satisfying asymptotic normality under the weaker design condition $\\Delta_n = o(n^{-1/p})$ for general $p \\ge 2$. Such a result has been obtained for elliptic SDEs in the literature, but its derivation in a hypo-elliptic setting is highly non-trivial. We provide numerical results to illustrate the advantages of the developed theory.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"103 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138553412","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reconstructions of piece-wise continuous and discrete functions using moments","authors":"Robert Mnatsakanov, Rafik Aramyan, Farhad Jafari","doi":"arxiv-2312.04462","DOIUrl":"https://doi.org/arxiv-2312.04462","url":null,"abstract":"The problem of recovering a moment-determinate multivariate function $f$ via its moment sequence is studied. Under mild conditions on $f$, the point-wise and $L_1$-rates of convergence for the proposed constructions are established. The cases where $f$ is the indicator function of a set, and represents a discrete probability mass function are also investigated. Calculations of the approximants and simulation studies are conducted to graphically illustrate the behavior of the approximations in several simple examples. Analytical and simulated errors of proposed approximations are recorded in Tables 1-3.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"9 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138553608","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"E-values, Multiple Testing and Beyond","authors":"Guanxun Li, Xianyang Zhang","doi":"arxiv-2312.02905","DOIUrl":"https://doi.org/arxiv-2312.02905","url":null,"abstract":"We discover a connection between the Benjamini-Hochberg (BH) procedure and the recently proposed e-BH procedure [Wang and Ramdas, 2022] with a suitably defined set of e-values. This insight extends to a generalized version of the BH procedure and the model-free multiple testing procedure in Barber and Candès [2015] (BC) with a general form of rejection rules. The connection provides an effective way of developing new multiple testing procedures by aggregating or assembling e-values resulting from the BH and BC procedures and their use in different subsets of the data. In particular, we propose new multiple testing methodologies in three applications, including a hybrid approach that integrates the BH and BC procedures, a multiple testing procedure aimed at ensuring a new notion of fairness by controlling both the group-wise and overall false discovery rates (FDR), and a structure adaptive multiple testing procedure that can incorporate external covariate information to boost detection power. One notable feature of the proposed methods is that we use a data-dependent approach for assigning weights to e-values, significantly enhancing the efficiency of the resulting e-BH procedure. The construction of the weights is non-trivial and is motivated by the leave-one-out analysis for the BH and BC procedures. In theory, we prove that the proposed e-BH procedures with data-dependent weights in the three applications ensure finite sample FDR control. Furthermore, we demonstrate the efficiency of the proposed methods through numerical studies in the three applications.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"93 3","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138521272","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Algorithms for mean-field variational inference via polyhedral optimization in the Wasserstein space","authors":"Yiheng Jiang, Sinho Chewi, Aram-Alexandre Pooladian","doi":"arxiv-2312.02849","DOIUrl":"https://doi.org/arxiv-2312.02849","url":null,"abstract":"We develop a theory of finite-dimensional polyhedral subsets over the Wasserstein space and optimization of functionals over them via first-order methods. Our main application is to the problem of mean-field variational inference, which seeks to approximate a distribution $\\pi$ over $\\mathbb{R}^d$ by a product measure $\\pi^\\star$. When $\\pi$ is strongly log-concave and log-smooth, we provide (1) approximation rates certifying that $\\pi^\\star$ is close to the minimizer $\\pi^\\star_\\diamond$ of the KL divergence over a \\emph{polyhedral} set $\\mathcal{P}_\\diamond$, and (2) an algorithm for minimizing $\\text{KL}(\\cdot|\\pi)$ over $\\mathcal{P}_\\diamond$ with accelerated complexity $O(\\sqrt \\kappa \\log(\\kappa d/\\varepsilon^2))$, where $\\kappa$ is the condition number of $\\pi$.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"86 5","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138521319","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Characterization of Optimal Prediction Measures via $\\ell_1$ Minimization","authors":"Len Bos","doi":"arxiv-2312.03091","DOIUrl":"https://doi.org/arxiv-2312.03091","url":null,"abstract":"Suppose that $K\\subset\\mathbb{C}$ is compact and that $z_0\\in\\mathbb{C}\\backslash K$ is an external point. An optimal prediction measure for regression by polynomials of degree at most $n,$ is one for which the variance of the prediction at $z_0$ is as small as possible. Hoel and Levine (\\cite{HL}) have considered the case of $K=[-1,1]$ and $z_0=x_0\\in \\mathbb{R}\\backslash [-1,1],$ where they show that the support of the optimal measure is the $n+1$ extreme points of the Chebyshev polynomial $T_n(x)$ and characterize the optimal weights in terms of absolute values of fundamental interpolating Lagrange polynomials. More recently, \\cite{BLO} has given the equivalence of the optimal prediction problem with that of finding polynomials of extremal growth. They also study in detail the case of $K=[-1,1]$ and $z_0=ia\\in i\\mathbb{R},$ purely imaginary. In this work we generalize the Hoel-Levine formula to the general case when the support of the optimal measure is a finite set and give a formula for the optimal weights in terms of an $\\ell_1$ minimization problem.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138546622","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Central limit theorem for the average closure coefficient","authors":"Mingao Yuan","doi":"arxiv-2312.03142","DOIUrl":"https://doi.org/arxiv-2312.03142","url":null,"abstract":"Many real-world networks exhibit the phenomenon of edge clustering, which is typically measured by the average clustering coefficient. Recently, an alternative measure, the average closure coefficient, was proposed to quantify local clustering. It is shown that the average closure coefficient possesses a number of useful properties and can capture complementary information missed by the classical average clustering coefficient. In this paper, we study the asymptotic distribution of the average closure coefficient of a heterogeneous Erdős-Rényi random graph. We prove that the standardized average closure coefficient converges in distribution to the standard normal distribution. In the Erdős-Rényi random graph, the variance of the average closure coefficient exhibits the same phase transition phenomenon as the average clustering coefficient.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"22 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138546790","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Asymptotic Theory of the Best-Choice Rerandomization using the Mahalanobis Distance","authors":"Yuhao Wang, Xinran Li","doi":"arxiv-2312.02513","DOIUrl":"https://doi.org/arxiv-2312.02513","url":null,"abstract":"Rerandomization, a design that utilizes pretreatment covariates and improves their balance between different treatment groups, has received attention recently in both theory and practice. There are at least two types of rerandomization that are used in practice: the first rerandomizes the treatment assignment until covariate imbalance is below a prespecified threshold; the second randomizes the treatment assignment multiple times and chooses the one with the best covariate balance. In this paper we will consider the second type of rerandomization, namely the best-choice rerandomization, whose theory and inference are still lacking in the literature. In particular, we will focus on the best-choice rerandomization that uses the Mahalanobis distance to measure covariate imbalance, which is one of the most commonly used imbalance measures for multivariate covariates and is invariant to affine transformations of covariates. We will study the large-sample repeated sampling properties of the best-choice rerandomization, allowing both the number of covariates and the number of tried complete randomizations to increase with the sample size. We show that the asymptotic distribution of the difference-in-means estimator is more concentrated around the true average treatment effect under rerandomization than under the complete randomization, and propose large-sample accurate confidence intervals for rerandomization that are shorter than those for the completely randomized experiment. We further demonstrate that, with a moderate number of covariates and with the number of tried randomizations increasing polynomially with the sample size, the best-choice rerandomization can achieve the ideally optimal precision that one can expect even with perfectly balanced covariates. The developed theory and methods for rerandomization are also illustrated using real field experiments.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"84 6","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138521325","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Near-Optimal Mean Estimation with Unknown, Heteroskedastic Variances","authors":"Spencer Compton, Gregory Valiant","doi":"arxiv-2312.02417","DOIUrl":"https://doi.org/arxiv-2312.02417","url":null,"abstract":"Given data drawn from a collection of Gaussian variables with a common mean but different and unknown variances, what is the best algorithm for estimating their common mean? We present an intuitive and efficient algorithm for this task. As different closed-form guarantees can be hard to compare, the Subset-of-Signals model serves as a benchmark for heteroskedastic mean estimation: given $n$ Gaussian variables with an unknown subset of $m$ variables having variance bounded by 1, what is the optimal estimation error as a function of $n$ and $m$? Our algorithm resolves this open question up to logarithmic factors, improving upon the previous best known estimation error by polynomial factors when $m = n^c$ for all $0<c<1$. Of particular note, we obtain error $o(1)$ with $m = \\tilde{O}(n^{1/4})$ variance-bounded samples, whereas previous work required $m = \\tilde{\\Omega}(n^{1/2})$. Finally, we show that in the multi-dimensional setting, even for $d=2$, our techniques enable rates comparable to knowing the variance of each sample.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"87 2","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138521317","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust parameter estimation of the log-logistic distribution based on density power divergence estimators","authors":"A. Felipe, M. Jaenada, P. Miranda, L. Pardo","doi":"arxiv-2312.02662","DOIUrl":"https://doi.org/arxiv-2312.02662","url":null,"abstract":"Robust inferential methods based on divergence measures have shown an appealing trade-off between efficiency and robustness in many different statistical models. In this paper, minimum density power divergence estimators (MDPDEs) for the scale and shape parameters of the log-logistic distribution are considered. The log-logistic is a versatile distribution modeling lifetime data which is commonly adopted in survival analysis and reliability engineering studies when the hazard rate is initially increasing but then decreases after some point. Further, it is shown that the classical estimators based on maximum likelihood (MLE) are included as a particular case of the MDPDE family. Moreover, the corresponding influence function of the MDPDE is obtained, and its boundedness is proved, thus leading to robust estimators. A simulation study is carried out to illustrate the slight loss in efficiency of MDPDE with respect to MLE and, besides, the considerable gain in robustness.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"2 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138547012","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}