{"title":"Jackknife Empirical Likelihood Ratio Test for Cauchy Distribution","authors":"Avhad Ganesh Vishnu, Ananya Lahiri, Sudheesh K. Kattumannil","doi":"arxiv-2409.05764","DOIUrl":"https://doi.org/arxiv-2409.05764","url":null,"abstract":"Heavy-tailed distributions, such as the Cauchy distribution, are acknowledged for providing more accurate models for financial returns, as the normal distribution is deemed insufficient for capturing the significant fluctuations observed in real-world assets. Data sets characterized by outlier sensitivity are critically important in diverse areas, including finance, economics, telecommunications, and signal processing. This article addresses a goodness-of-fit test for the Cauchy distribution. The proposed test utilizes empirical likelihood methods, including the jackknife empirical likelihood (JEL) and adjusted jackknife empirical likelihood (AJEL). Extensive Monte Carlo simulation studies are conducted to evaluate the finite sample performance of the proposed test. The application of the proposed test is illustrated through the analysis of two real data sets.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142192587","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parameter estimation for fractional stochastic heat equations : Berry-Esséen bounds in CLTs","authors":"Soukaina Douissi, Fatimah Alshahrani","doi":"arxiv-2409.05416","DOIUrl":"https://doi.org/arxiv-2409.05416","url":null,"abstract":"The aim of this work is to estimate the drift coefficient of a fractional heat equation driven by an additive space-time noise using the maximum likelihood estimator (MLE). In the first part of the paper, the first $N$ Fourier modes of the solution are observed continuously over a finite time interval $[0, T]$. Explicit upper bounds for the Wasserstein distance in the central limit theorem of the MLE are provided when $N \\rightarrow \\infty$ and/or $T \\rightarrow \\infty$. In the second part of the paper, the $N$ Fourier modes are observed on a uniform time grid: $t_i = i \\frac{T}{M}$, $i=0,\\ldots,M$, where $M$ is the number of time grid points. The consistency and asymptotic normality are studied when $T, M, N \\rightarrow +\\infty$, in addition to the rate of convergence in law in the CLT.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"44 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142192983","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On integer partitions and the Wilcoxon rank-sum statistic","authors":"Andrew V. Sills","doi":"arxiv-2409.05741","DOIUrl":"https://doi.org/arxiv-2409.05741","url":null,"abstract":"In the literature, the derivation of exact null distributions of rank-sum statistics is often avoided in cases where one or more ties exist in the data. By deriving the null distribution in the no-ties case with the aid of classical $q$-series results of Euler and Rothe, we demonstrate how a natural generalization of the method may be employed to derive exact null distributions even when one or more ties are present in the data. It is suggested that this method could be implemented in a computer algebra system, or even a more primitive computer language, so that the normal approximation need not be employed in the case of small sample sizes, when it is less likely to be very accurate. Several algorithms for determining exact distributions of the rank-sum statistic (possibly with ties) have been given in the literature (see Streitberg and Röhmel (1986) and Marx et al. (2016)), but none seem as simple as the procedure discussed here, which amounts to multiplying out a certain polynomial, extracting coefficients, and finally dividing by a binomial coefficient.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"41 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142192588","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Empirical Bernstein in smooth Banach spaces","authors":"Diego Martinez-Taboada, Aaditya Ramdas","doi":"arxiv-2409.06060","DOIUrl":"https://doi.org/arxiv-2409.06060","url":null,"abstract":"Existing concentration bounds for bounded vector-valued random variables include extensions of the scalar Hoeffding and Bernstein inequalities. While the latter is typically tighter, it requires knowing a bound on the variance of the random variables. We derive a new vector-valued empirical Bernstein inequality, which makes use of an empirical estimator of the variance instead of the true variance. The bound holds in 2-smooth separable Banach spaces, which include finite dimensional Euclidean spaces and separable Hilbert spaces. The resulting confidence sets are instantiated for both the batch setting (where the sample size is fixed) and the sequential setting (where the sample size is a stopping time). The confidence set width asymptotically exactly matches that achieved by Bernstein in the leading term. The method and supermartingale proof technique combine several tools of Pinelis (1994) and Waudby-Smith and Ramdas (2024).","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"8 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142192584","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Markov Chain Variance Estimation: A Stochastic Approximation Approach","authors":"Shubhada Agrawal, Prashanth L. A., Siva Theja Maguluri","doi":"arxiv-2409.05733","DOIUrl":"https://doi.org/arxiv-2409.05733","url":null,"abstract":"We consider the problem of estimating the asymptotic variance of a function defined on a Markov chain, an important step for statistical inference of the stationary mean. We design the first recursive estimator that requires $O(1)$ computation at each step, does not require storing any historical samples or any prior knowledge of run-length, and has optimal $O(\\frac{1}{n})$ rate of convergence for the mean-squared error (MSE) with provable finite sample guarantees. Here, $n$ refers to the total number of samples generated. The previously best-known rate of convergence in MSE was $O(\\frac{\\log n}{n})$, achieved by jackknifed estimators, which also do not enjoy these other desirable properties. Our estimator is based on linear stochastic approximation of an equivalent formulation of the asymptotic variance in terms of the solution of the Poisson equation. We generalize our estimator in several directions, including estimating the covariance matrix for vector-valued functions, estimating the stationary variance of a Markov chain, and approximately estimating the asymptotic variance in settings where the state space of the underlying Markov chain is large. We also show applications of our estimator in average reward reinforcement learning (RL), where we work with asymptotic variance as a risk measure to model safety-critical applications. We design a temporal-difference type algorithm tailored for policy evaluation in this context. We consider both the tabular and linear function approximation settings. Our work paves the way for developing actor-critic style algorithms for variance-constrained RL.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"8 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142192980","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Empirical likelihood for generalized smoothly trimmed mean","authors":"Elina Kresse, Emils Silins, Janis Valeinis","doi":"arxiv-2409.05631","DOIUrl":"https://doi.org/arxiv-2409.05631","url":null,"abstract":"This paper introduces a new version of the smoothly trimmed mean with a more general version of weights, which can be used as an alternative to the classical trimmed mean. We derive its asymptotic variance and to further investigate its properties we establish the empirical likelihood for the new estimator. As expected from previous theoretical investigations we show in our simulations a clear advantage of the proposed estimator over the classical trimmed mean estimator. Moreover, the empirical likelihood method gives an additional advantage for data generated from contaminated models. For the classical trimmed mean it is generally recommended in practice to use symmetrical 10% or 20% trimming. However, if the trimming is done close to data gaps, it can even lead to spurious results, as known from the literature and verified by our simulations. Instead, for practical data examples, we choose the smoothing parameters by an optimality criterion that minimises the variance of the proposed estimators.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"9 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142192982","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient estimation with incomplete data via generalised ANOVA decomposition","authors":"Thomas B. Berrett","doi":"arxiv-2409.05729","DOIUrl":"https://doi.org/arxiv-2409.05729","url":null,"abstract":"We study the efficient estimation of a class of mean functionals in settings where a complete multivariate dataset is complemented by additional datasets recording subsets of the variables of interest. These datasets are allowed to have a general, in particular non-monotonic, structure. Our main contribution is to characterise the asymptotic minimal mean squared error for these problems and to introduce an estimator whose risk approximately matches this lower bound. We show that the efficient rescaled variance can be expressed as the minimal value of a quadratic optimisation problem over a function space, thus establishing a fundamental link between these estimation problems and the theory of generalised ANOVA decompositions. Our estimation procedure uses iterated nonparametric regression to mimic an approximate influence function derived through gradient descent. We prove that this estimator is approximately normally distributed, provide an estimator of its variance and thus develop confidence intervals of asymptotically minimal width. Finally, we study a more direct estimator, which can be seen as a U-statistic with a data-dependent kernel, showing that it is also efficient under stronger regularity conditions.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"396 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142192590","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Statistical Mechanics of Min-Max Problems","authors":"Yuma Ichikawa, Koji Hukushima","doi":"arxiv-2409.06053","DOIUrl":"https://doi.org/arxiv-2409.06053","url":null,"abstract":"Min-max optimization problems, also known as saddle point problems, have attracted significant attention due to their applications in various fields, such as fair beamforming, generative adversarial networks (GANs), and adversarial learning. However, understanding the properties of these min-max problems has remained a substantial challenge. This study introduces a statistical mechanical formalism for analyzing the equilibrium values of min-max problems in the high-dimensional limit, while appropriately addressing the order of operations for min and max. As a first step, we apply this formalism to bilinear min-max games and simple GANs, deriving the relationship between the amount of training data and generalization error and indicating the optimal ratio of fake to real data for effective learning. This formalism provides a groundwork for a deeper theoretical analysis of the equilibrium properties in various machine learning methods based on min-max problems and encourages the development of new algorithms and architectures.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"33 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142192586","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Common or specific source, features or scores; it is all a matter of information","authors":"Aafko Boonstra, Ronald Meester, Klaas Slooten","doi":"arxiv-2409.05403","DOIUrl":"https://doi.org/arxiv-2409.05403","url":null,"abstract":"We show that the incorporation of any new piece of information allows for improved decision making in the sense that the expected costs of an optimal decision decrease (or, in boundary cases where no or not enough new information is incorporated, stay the same) whenever this is done by the appropriate update of the probabilities of the hypotheses. Versions of this result have been stated before. However, previous proofs rely on auxiliary constructions with proper scoring rules. We, instead, offer a direct and completely general proof by considering elementary properties of likelihood ratios only. We do point out the relation to proper scoring rules. We apply our results to make a contribution to the debates about the use of score-based/feature-based and common/specific source likelihood ratios. In the literature these are often presented as different ``LR-systems''. We argue that deciding which LR to compute is simply a matter of the available information. There is no such thing as different ``LR-systems''; there are only differences in the available information. In particular, despite claims to the contrary, scores can very well be used in forensic practice and we illustrate this with an extensive example in a DNA kinship context.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"58 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142192985","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Continuous Generalization of Hypothesis Testing","authors":"Nick W. Koning","doi":"arxiv-2409.05654","DOIUrl":"https://doi.org/arxiv-2409.05654","url":null,"abstract":"Testing has developed into the fundamental statistical framework for falsifying hypotheses. Unfortunately, tests are binary in nature: a test either rejects a hypothesis or not. Such binary decisions do not reflect the reality of many scientific studies, which often aim to present the evidence against a hypothesis and do not necessarily intend to establish a definitive conclusion. To solve this, we propose the continuous generalization of a test, which we use to measure the evidence against a hypothesis. Such a continuous test can be interpreted as a non-randomized interpretation of the classical 'randomized test'. This offers the benefits of a randomized test, without the downsides of external randomization. Another interpretation is as a literal measure, which measures the amount of binary tests that reject the hypothesis. Our work also offers a new perspective on the $e$-value: the $e$-value is recovered as a continuous test with $\\alpha \\to 0$, or as an unbounded measure of the amount of rejections.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"27 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142192981","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}