arXiv - STAT - Statistics Theory: Latest Papers

Functional Adaptive Huber Linear Regression
arXiv - STAT - Statistics Theory Pub Date : 2024-09-17 DOI: arxiv-2409.11053
Ling Peng, Xiaohui Liu, Heng Lian
Abstract: Robust estimation has played an important role in statistical and machine learning. However, its applications to functional linear regression are still under-developed. In this paper, we focus on Huber's loss with a diverging robustness parameter, which was previously used in parametric models. Compared to other robust methods such as median regression, the distinction is that the proposed method aims to estimate the conditional mean robustly, instead of estimating the conditional median. We only require a $(1+\kappa)$-th moment assumption ($\kappa>0$) on the noise distribution, and the established error bounds match the optimal rate in the least-squares case as soon as $\kappa\ge 1$. We establish a convergence rate in probability when the functional predictor has a finite fourth moment, and a finite-sample bound with an exponential tail when the functional predictor is Gaussian, in terms of both prediction error and $L^2$ error. The results also extend to the case of functional estimation in a reproducing kernel Hilbert space (RKHS).
Citations: 0
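As a rough illustration of why a diverging robustness parameter targets the conditional mean, the sketch below implements the standard Huber loss and a simple Huber location estimate. This is not the functional regression estimator of the paper: the names `huber_loss` and `huber_mean`, the reweighting scheme, and the iteration count are assumptions made for this example, and the abstract does not state the rate at which the robustness parameter should grow.

```python
import numpy as np

def huber_loss(residual, tau):
    """Standard Huber loss: quadratic for |r| <= tau, linear beyond."""
    r = np.abs(np.asarray(residual, dtype=float))
    return np.where(r <= tau, 0.5 * r**2, tau * r - 0.5 * tau**2)

def huber_mean(x, tau, n_iter=100):
    """Huber location estimate via iteratively reweighted averaging.

    Illustrative only: observations with |x - mu| > tau are downweighted
    by tau / |x - mu|. As tau grows, all weights become 1 and the
    estimate coincides with the sample mean.
    """
    x = np.asarray(x, dtype=float)
    mu = np.median(x)  # robust starting point
    for _ in range(n_iter):
        r = np.abs(x - mu)
        w = np.minimum(1.0, tau / np.maximum(r, 1e-12))
        mu = np.sum(w * x) / np.sum(w)
    return float(mu)

data = [0.0, 1.0, 2.0, 100.0]
small_tau = huber_mean(data, tau=1.0)   # resists the outlier at 100
large_tau = huber_mean(data, tau=1e6)   # behaves like the sample mean
```

With a small `tau` the estimate stays near the bulk of the data, while a very large `tau` reproduces the ordinary mean, which is the intuition behind letting the parameter diverge with the sample size.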
Permutation groups, partition lattices and block structures
arXiv - STAT - Statistics Theory Pub Date : 2024-09-16 DOI: arxiv-2409.10461
Marina Anagnostopoulou-Merkouri, R. A. Bailey, Peter J. Cameron
Abstract: Let $G$ be a transitive permutation group on $\Omega$. The $G$-invariant partitions form a sublattice of the lattice of all partitions of $\Omega$, having the further property that all its elements are uniform (that is, have all parts of the same size). If, in addition, all the equivalence relations defining the partitions commute, then the relations form an orthogonal block structure, a concept from statistics; in this case the lattice is modular. If it is distributive, then we have a poset block structure, whose automorphism group is a generalised wreath product. We examine permutation groups with these properties, which we call the OB property and PB property respectively, and in particular investigate when direct and wreath products of groups with these properties also have these properties. A famous theorem on permutation groups asserts that a transitive imprimitive group $G$ is embeddable in the wreath product of two factors obtained from the group (the group induced on a block by its setwise stabiliser, and the group induced on the set of blocks by $G$). We extend this theorem to groups with the PB property, embedding them into generalised wreath products. We show that the map from posets to generalised wreath products preserves intersections and inclusions. We have included background and historical material on these concepts.
Citations: 0
Variance Residual Life Ageing Intensity Function
arXiv - STAT - Statistics Theory Pub Date : 2024-09-16 DOI: arxiv-2409.10591
Ashutosh Singh
Abstract: Quantitative measurement of ageing across systems and components is crucial for accurately assessing reliability and predicting failure probabilities. This measurement supports effective maintenance scheduling, performance optimisation, and cost management. Examining the ageing characteristics of a system that operates beyond a specified time $t > 0$ yields valuable insights. This paper introduces a novel metric for ageing, termed the Variance Residual Life Ageing Intensity (VRLAI) function, and explores its properties across various probability distributions. Additionally, we characterise the closure properties of the two ageing classes defined by the VRLAI function. We propose a new ordering, called the Variance Residual Life Ageing Intensity (VRLAI) ordering, and discuss its various properties. Furthermore, we examine the closure of the VRLAI order under coherent systems.
Citations: 0
Learning large softmax mixtures with warm start EM
arXiv - STAT - Statistics Theory Pub Date : 2024-09-16 DOI: arxiv-2409.09903
Xin Bing, Florentina Bunea, Jonathan Niles-Weed, Marten Wegkamp
Abstract: Mixed multinomial logits are discrete mixtures introduced several decades ago to model the probability of choosing an attribute from $p$ possible candidates, in heterogeneous populations. The model has recently attracted attention in the AI literature, under the name softmax mixtures, where it is routinely used in the final layer of a neural network to map a large number $p$ of vectors in $\mathbb{R}^L$ to a probability vector. Despite its wide applicability and empirical success, statistically optimal estimators of the mixture parameters, obtained via algorithms whose running time scales polynomially in $L$, are not known. This paper provides a solution to this problem for contemporary applications, such as large language models, in which the mixture has a large number $p$ of support points, and the size $N$ of the sample observed from the mixture is also large. Our proposed estimator combines two classical estimators, obtained respectively via a method of moments (MoM) and the expectation-maximization (EM) algorithm. Although both estimator types have been studied, from a theoretical perspective, for Gaussian mixtures, no similar results exist for softmax mixtures for either procedure. We develop a new MoM parameter estimator based on latent moment estimation that is tailored to our model, and provide the first theoretical analysis for a MoM-based procedure in softmax mixtures. Although consistent, MoM for softmax mixtures can exhibit poor numerical performance, as observed in other mixture models. Nevertheless, as MoM is provably in a neighborhood of the target, it can be used as a warm start for any iterative algorithm. We study in detail the EM algorithm, and provide its first theoretical analysis for softmax mixtures. Our final proposal for parameter estimation is the EM algorithm with a MoM warm start.
Citations: 0
Extending the Gini Index to Higher Dimensions via Whitening Processes
arXiv - STAT - Statistics Theory Pub Date : 2024-09-16 DOI: arxiv-2409.10119
Gennaro Auricchio, Paolo Giudici, Giuseppe Toscani
Abstract: Measuring the degree of inequality expressed by a multivariate statistical distribution is a challenging problem, which appears in many fields of science and engineering. In this paper, we propose to extend the well-known univariate Gini coefficient to multivariate distributions, by maintaining most of its properties. Our extension is based on the application of whitening processes that possess the property of scale stability.
Citations: 0
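For readers unfamiliar with the univariate baseline that this paper extends, the following sketch computes the classical Gini coefficient of a non-negative sample. It shows only the one-dimensional index; the multivariate, whitening-based extension is not described in enough detail in the abstract to reproduce, and the function name `gini` is an assumption of this example.

```python
import numpy as np

def gini(x):
    """Classical univariate Gini coefficient of a non-negative sample.

    Uses the sorted-data identity
        G = sum_i (2i - n - 1) x_(i) / (n * sum_i x_i),
    which equals the mean absolute difference divided by twice the mean.
    """
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    index = np.arange(1, n + 1)
    return float(np.sum((2 * index - n - 1) * x) / (n * np.sum(x)))

equal = gini([1.0, 1.0, 1.0])               # perfectly equal sample
concentrated = gini([0.0, 0.0, 0.0, 1.0])   # all mass on one unit
```

The univariate index is unchanged when every observation is multiplied by the same positive constant, which is one of the properties a multivariate extension would naturally want to retain.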
Learning with Sparsely Permuted Data: A Robust Bayesian Approach
arXiv - STAT - Statistics Theory Pub Date : 2024-09-16 DOI: arxiv-2409.10678
Abhisek Chakraborty, Saptati Datta
Abstract: Data dispersed across multiple files are commonly integrated through probabilistic linkage methods, where even minimal error rates in record matching can significantly contaminate subsequent statistical analyses. In regression problems, we examine scenarios where the identifiers of predictors or responses are subject to an unknown permutation, challenging the assumption of correspondence. Many emerging approaches in the literature focus on sparsely permuted data, where only a small subset of pairs ($k \ll n$) are affected by the permutation, treating these permuted entries as outliers to restore original correspondence and obtain consistent estimates of regression parameters. In this article, we complement the existing literature by introducing a novel generalized robust Bayesian formulation of the problem. We develop an efficient posterior sampling scheme by adapting the fractional posterior framework and addressing key computational bottlenecks via careful use of discrete optimal transport and sampling in the space of binary matrices with fixed margins. Further, we establish new posterior contraction results within this framework, providing theoretical guarantees for our approach. The utility of the proposed framework is demonstrated via extensive numerical experiments.
Citations: 0
Consistent complete independence test in high dimensions based on Chatterjee correlation coefficient
arXiv - STAT - Statistics Theory Pub Date : 2024-09-16 DOI: arxiv-2409.10315
Liqi Xia, Ruiyuan Cao, Jiang Du, Jun Dai
Abstract: In this article, we consider the complete independence test of high-dimensional data. Based on the Chatterjee coefficient, we pioneer the development of a quadratic test and an extreme value test which possess good testing performance for oscillatory data, and establish the corresponding large sample properties under both null hypotheses and alternative hypotheses. In order to overcome the shortcomings of the quadratic statistic and the extreme value statistic, we propose a testing method termed the power enhancement test by adding a screening statistic to the quadratic statistic. The proposed method does not reduce the testing power under dense alternative hypotheses, but can enhance the power significantly under sparse alternative hypotheses. Three synthetic data examples and two real data examples are further used to illustrate the performance of our proposed methods.
Citations: 0
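The pairwise building block of such tests, Chatterjee's rank correlation, can be sketched as follows. This is only the two-variable coefficient under the assumption of no ties; the quadratic, extreme value, and power enhancement statistics of the paper are not reproduced here, and the function name `chatterjee_xi` is chosen for this example.

```python
import numpy as np

def chatterjee_xi(x, y):
    """Chatterjee's rank correlation xi_n(X, Y), assuming no ties.

    Sort the pairs by x, let r_i be the rank of the i-th y in that order,
    and return 1 - 3 * sum_i |r_{i+1} - r_i| / (n^2 - 1).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    order = np.argsort(x, kind="stable")
    y_sorted = y[order]
    ranks = np.argsort(np.argsort(y_sorted)) + 1  # ranks 1..n of y
    return float(1.0 - 3.0 * np.sum(np.abs(np.diff(ranks))) / (n**2 - 1))

# An oscillating but deterministic relationship: Pearson correlation is
# weak here, while xi stays close to 1.
grid = np.linspace(0.0, 20.0, 1000)
xi_sine = chatterjee_xi(grid, np.sin(grid))
```

The sine example hints at why the abstract emphasises oscillatory data: the coefficient responds to functional dependence of any shape rather than to monotone or linear association only.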
Privately Learning Smooth Distributions on the Hypercube by Projections
arXiv - STAT - Statistics Theory Pub Date : 2024-09-16 DOI: arxiv-2409.10083
Clément Lalanne (TSE-R), Sébastien Gadat (TSE-R, IUF)
Abstract: Fueled by the ever-increasing need for statistics that guarantee the privacy of their training sets, this article studies the centrally-private estimation of Sobolev-smooth densities of probability over the hypercube in dimension d. The contributions of this article are two-fold. Firstly, it generalizes the one-dimensional results of (Lalanne et al., 2023) to non-integer levels of smoothness and to a high-dimensional setting, which is important for two reasons: it is more suited for modern learning tasks, and it allows understanding the relations between privacy, dimensionality and smoothness, which is a central question with differential privacy. Secondly, this article presents a private strategy of estimation that is data-driven (usually referred to as adaptive in Statistics) in order to privately choose an estimator that achieves a good bias-variance trade-off among a finite family of private projection estimators without prior knowledge of the ground-truth smoothness $\beta$. This is achieved by adapting the Lepskii method for private selection, by adding a new penalization term that makes the estimation privacy-aware.
Citations: 0
Mean Residual Life Ageing Intensity Function
arXiv - STAT - Statistics Theory Pub Date : 2024-09-16 DOI: arxiv-2409.10456
Ashutosh Singh, Ishapathik Das, Asok Kumar Nanda, Sumen Sen
Abstract: The ageing intensity function is a powerful analytical tool that provides valuable insights into the ageing process across diverse domains such as reliability engineering, actuarial science, and healthcare. Its applications continue to expand as researchers delve deeper into understanding the complex dynamics of ageing and its implications for society. One common approach to defining the ageing intensity function is through the hazard rate or failure rate function, extensively explored in scholarly literature. Equally significant to the hazard rate function is the mean residual life function, which plays a crucial role in analyzing the ageing patterns exhibited by units or components. This article introduces the mean residual life ageing intensity (MRLAI) function to delve into component ageing behaviours across various distributions. Additionally, we scrutinize the closure properties of the MRLAI function across different reliability operations. Furthermore, a new order termed the mean residual life ageing intensity order is defined to analyze the ageing behaviour of a system, and the closure property of this order under various reliability operations is discussed.
Citations: 0
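The mean residual life function on which this entry builds is the expected remaining lifetime given survival past time $t$, that is $m(t) = E[X - t \mid X > t]$. A minimal empirical version is sketched below. The MRLAI function itself is not defined in the abstract and is therefore not implemented, and the name `empirical_mrl` is an assumption of this example.

```python
import numpy as np

def empirical_mrl(lifetimes, t):
    """Empirical mean residual life at time t.

    Averages (x - t) over the observed lifetimes x that exceed t.
    Returns NaN when no unit survives past t.
    """
    x = np.asarray(lifetimes, dtype=float)
    survivors = x[x > t]
    if survivors.size == 0:
        return float("nan")
    return float(np.mean(survivors - t))

# Units failing at times 1, 2, 3, 4: those alive after t = 2 fail at 3 and 4.
remaining = empirical_mrl([1.0, 2.0, 3.0, 4.0], t=2.0)
```

For an exponential lifetime the true mean residual life is constant in $t$, so a curve of `empirical_mrl` values that drifts downward or upward gives a first, informal indication of ageing behaviour.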
Towards a Unified Theory for Semiparametric Data Fusion with Individual-Level Data
arXiv - STAT - Statistics Theory Pub Date : 2024-09-16 DOI: arxiv-2409.09973
Ellen Graham (University of Washington), Marco Carone (University of Washington), Andrea Rotnitzky (University of Washington)
Abstract: We address the goal of conducting inference about a smooth finite-dimensional parameter by utilizing individual-level data from various independent sources. Recent advancements have led to the development of a comprehensive theory capable of handling scenarios where different data sources align with, possibly distinct subsets of, conditional distributions of a single factorization of the joint target distribution. While this theory proves effective in many significant contexts, it falls short in certain common data fusion problems, such as two-sample instrumental variable analysis, settings that integrate data from epidemiological studies with diverse designs (e.g., prospective cohorts and retrospective case-control studies), and studies with variables prone to measurement error that are supplemented by validation studies. In this paper, we extend the aforementioned comprehensive theory to allow for the fusion of individual-level data from sources aligned with conditional distributions that do not correspond to a single factorization of the target distribution. Assuming conditional and marginal distribution alignments, we provide universal results that characterize the class of all influence functions of regular asymptotically linear estimators and the efficient influence function of any pathwise differentiable parameter, irrespective of the number of data sources, the specific parameter of interest, or the statistical model for the target distribution. This theory paves the way for machine-learning debiased, semiparametric efficient estimation.
Citations: 0