{"title":"Deviation and moment inequalities for Banach-valued $U$-statistics","authors":"Davide GiraudoIRMA, UNISTRA UFR MI","doi":"arxiv-2405.01902","DOIUrl":"https://doi.org/arxiv-2405.01902","url":null,"abstract":"We show a deviation inequality for U-statistics of independent data taking\u0000values in a separable Banach space which satisfies some smoothness assumptions.\u0000We then provide applications to rates in the law of large numbers for\u0000U-statistics, a H{\"o}lderian functional central limit theorem and a moment\u0000inequality for incomplete $U$-statistics.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"19 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140889073","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multimatrix variate distributions","authors":"José A. Díaz-García, Francisco J. Caro-Lopera","doi":"arxiv-2405.02498","DOIUrl":"https://doi.org/arxiv-2405.02498","url":null,"abstract":"A new family of distributions indexed by the class of matrix variate\u0000contoured elliptically distribution is proposed as an extension of some\u0000bimatrix variate distributions. The termed emph{multimatrix variate\u0000distributions} open new perspectives for the classical distribution theory,\u0000usually based on probabilistic independent models and preferred untested\u0000fitting laws. Most of the multimatrix models here derived are invariant under\u0000the spherical family, a fact that solves the testing and prior knowledge of the\u0000underlying distributions and elucidates the statistical methodology in\u0000contrasts with some weakness of current studies as copulas. The paper also\u0000includes a number of diverse special cases, properties and generalisations. The\u0000new joint distributions allows several unthinkable combinations for copulas,\u0000such as scalars, vectors and matrices, all of them adjustable to the required\u0000models of the experts. The proposed joint distributions are also easily\u0000computable, then several applications are plausible. 
In particular, an\u0000exhaustive example in molecular docking on SARS-CoV-2 presents the results on\u0000matrix dependent samples.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140889036","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Full Adagrad algorithm with O(Nd) operations","authors":"Antoine Godichon-BaggioniLPSM, Wei LuLMI, Bruno PortierLMI","doi":"arxiv-2405.01908","DOIUrl":"https://doi.org/arxiv-2405.01908","url":null,"abstract":"A novel approach is given to overcome the computational challenges of the\u0000full-matrix Adaptive Gradient algorithm (Full AdaGrad) in stochastic\u0000optimization. By developing a recursive method that estimates the inverse of\u0000the square root of the covariance of the gradient, alongside a streaming\u0000variant for parameter updates, the study offers efficient and practical\u0000algorithms for large-scale applications. This innovative strategy significantly\u0000reduces the complexity and resource demands typically associated with\u0000full-matrix methods, enabling more effective optimization processes. Moreover,\u0000the convergence rates of the proposed estimators and their asymptotic\u0000efficiency are given. Their effectiveness is demonstrated through numerical\u0000studies.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"165 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140889005","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improved distance correlation estimation","authors":"Blanca E. Monroy-Castillo, M. A, Jácome, Ricardo Cao","doi":"arxiv-2405.01958","DOIUrl":"https://doi.org/arxiv-2405.01958","url":null,"abstract":"Distance correlation is a novel class of multivariate dependence measure,\u0000taking positive values between 0 and 1, and applicable to random vectors of\u0000arbitrary dimensions, not necessarily equal. It offers several advantages over\u0000the well-known Pearson correlation coefficient, the most important is that\u0000distance correlation equals zero if and only if the random vectors are\u0000independent. There are two different estimators of the distance correlation available in\u0000the literature. The first one, proposed by Sz'ekely et al. (2007), is based on\u0000an asymptotically unbiased estimator of the distance covariance which turns out\u0000to be a V-statistic. The second one builds on an unbiased estimator of the\u0000distance covariance proposed in Sz'ekely et al. (2014), proved to be an\u0000U-statistic by Sz'ekely and Huo (2016). This study evaluates their efficiency\u0000(mean squared error) and compares computational times for both methods under\u0000different dependence structures. Under conditions of independence or\u0000near-independence, the V-estimates are biased, while the U-estimator frequently\u0000cannot be computed due to negative values. 
To address this challenge, a convex\u0000linear combination of the former estimators is proposed and studied, yielding\u0000good results regardless of the level of dependence.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"8 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140889007","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Finite Sample Analysis and Bounds of Generalization Error of Gradient Descent in In-Context Linear Regression","authors":"Karthik Duraisamy","doi":"arxiv-2405.02462","DOIUrl":"https://doi.org/arxiv-2405.02462","url":null,"abstract":"Recent studies show that transformer-based architectures emulate gradient\u0000descent during a forward pass, contributing to in-context learning capabilities\u0000- an ability where the model adapts to new tasks based on a sequence of prompt\u0000examples without being explicitly trained or fine tuned to do so. This work\u0000investigates the generalization properties of a single step of gradient descent\u0000in the context of linear regression with well-specified models. A random design\u0000setting is considered and analytical expressions are derived for the\u0000statistical properties of generalization error in a non-asymptotic (finite\u0000sample) setting. These expressions are notable for avoiding arbitrary\u0000constants, and thus offer robust quantitative information and scaling\u0000relationships. These results are contrasted with those from classical least\u0000squares regression (for which analogous finite sample bounds are also derived),\u0000shedding light on systematic and noise components, as well as optimal step\u0000sizes. 
Additionally, identities involving high-order products of Gaussian\u0000random matrices are presented as a byproduct of the analysis.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"16 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140889038","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mathematics of statistical sequential decision-making: concentration, risk-awareness and modelling in stochastic bandits, with applications to bariatric surgery","authors":"Patrick Saux","doi":"arxiv-2405.01994","DOIUrl":"https://doi.org/arxiv-2405.01994","url":null,"abstract":"This thesis aims to study some of the mathematical challenges that arise in\u0000the analysis of statistical sequential decision-making algorithms for\u0000postoperative patients follow-up. Stochastic bandits (multiarmed, contextual)\u0000model the learning of a sequence of actions (policy) by an agent in an\u0000uncertain environment in order to maximise observed rewards. To learn optimal\u0000policies, bandit algorithms have to balance the exploitation of current\u0000knowledge and the exploration of uncertain actions. Such algorithms have\u0000largely been studied and deployed in industrial applications with large\u0000datasets, low-risk decisions and clear modelling assumptions, such as\u0000clickthrough rate maximisation in online advertising. By contrast, digital\u0000health recommendations call for a whole new paradigm of small samples,\u0000risk-averse agents and complex, nonparametric modelling. To this end, we\u0000developed new safe, anytime-valid concentration bounds, (Bregman, empirical\u0000Chernoff), introduced a new framework for risk-aware contextual bandits (with\u0000elicitable risk measures) and analysed a novel class of nonparametric bandit\u0000algorithms under weak assumptions (Dirichlet sampling). In addition to the\u0000theoretical guarantees, these results are supported by in-depth empirical\u0000evidence. 
Finally, as a first step towards personalised postoperative follow-up\u0000recommendations, we developed with medical doctors and surgeons an\u0000interpretable machine learning model to predict the long-term weight\u0000trajectories of patients after bariatric surgery.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"28 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140889066","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A comparative study of conformal prediction methods for valid uncertainty quantification in machine learning","authors":"Nicolas Dewolf","doi":"arxiv-2405.02082","DOIUrl":"https://doi.org/arxiv-2405.02082","url":null,"abstract":"In the past decades, most work in the area of data analysis and machine\u0000learning was focused on optimizing predictive models and getting better results\u0000than what was possible with existing models. To what extent the metrics with\u0000which such improvements were measured were accurately capturing the intended\u0000goal, whether the numerical differences in the resulting values were\u0000significant, or whether uncertainty played a role in this study and if it\u0000should have been taken into account, was of secondary importance. Whereas\u0000probability theory, be it frequentist or Bayesian, used to be the gold standard\u0000in science before the advent of the supercomputer, it was quickly replaced in\u0000favor of black box models and sheer computing power because of their ability to\u0000handle large data sets. This evolution sadly happened at the expense of\u0000interpretability and trustworthiness. However, while people are still trying to\u0000improve the predictive power of their models, the community is starting to\u0000realize that for many applications it is not so much the exact prediction that\u0000is of importance, but rather the variability or uncertainty. The work in this dissertation tries to further the quest for a world where\u0000everyone is aware of uncertainty, of how important it is and how to embrace it\u0000instead of fearing it. A specific, though general, framework that allows anyone\u0000to obtain accurate uncertainty estimates is singled out and analysed. Certain\u0000aspects and applications of the framework -- dubbed `conformal prediction' --\u0000are studied in detail. 
Whereas many approaches to uncertainty quantification\u0000make strong assumptions about the data, conformal prediction is, at the time of\u0000writing, the only framework that deserves the title `distribution-free'. No\u0000parametric assumptions have to be made and the nonparametric results also hold\u0000without having to resort to the law of large numbers in the asymptotic regime.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"16 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140889068","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Gapeev-Shiryaev Conjecture","authors":"Philip A. Ernst, Goran Peskir","doi":"arxiv-2405.01685","DOIUrl":"https://doi.org/arxiv-2405.01685","url":null,"abstract":"The Gapeev-Shiryaev conjecture (originating in Gapeev and Shiryaev (2011) and\u0000Gapeev and Shiryaev (2013)) can be broadly stated as follows: Monotonicity of\u0000the signal-to-noise ratio implies monotonicity of the optimal stopping\u0000boundaries. The conjecture was originally formulated both within (i) sequential\u0000testing problems for diffusion processes (where one needs to decide which of\u0000the two drifts is being indirectly observed) and (ii) quickest detection\u0000problems for diffusion processes (where one needs to detect when the initial\u0000drift changes to a new drift). In this paper we present proofs of the\u0000Gapeev-Shiryaev conjecture both in (i) the sequential testing setting (under\u0000Lipschitz/Holder coefficients of the underlying SDEs) and (ii) the quickest\u0000detection setting (under analytic coefficients of the underlying SDEs). The\u0000method of proof in the sequential testing setting relies upon a stochastic time\u0000change and pathwise comparison arguments. Both arguments break down in the\u0000quickest detection setting and get replaced by arguments arising from a\u0000stochastic maximum principle for hypoelliptic equations (satisfying Hormander's\u0000condition) that is of independent interest. 
Verification of the Gapeev-Shiryaev\u0000conjecture establishes the fact that sequential testing and quickest detection\u0000problems with monotone signal-to-noise ratios are amenable to known methods of\u0000solution.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"7 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140889071","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Minimax Regret Learning for Data with Heterogeneous Subgroups","authors":"Weibin Mo, Weijing Tang, Songkai Xue, Yufeng Liu, Ji Zhu","doi":"arxiv-2405.01709","DOIUrl":"https://doi.org/arxiv-2405.01709","url":null,"abstract":"Modern complex datasets often consist of various sub-populations. To develop\u0000robust and generalizable methods in the presence of sub-population\u0000heterogeneity, it is important to guarantee a uniform learning performance\u0000instead of an average one. In many applications, prior information is often\u0000available on which sub-population or group the data points belong to. Given the\u0000observed groups of data, we develop a min-max-regret (MMR) learning framework\u0000for general supervised learning, which targets to minimize the worst-group\u0000regret. Motivated from the regret-based decision theoretic framework, the\u0000proposed MMR is distinguished from the value-based or risk-based robust\u0000learning methods in the existing literature. The regret criterion features\u0000several robustness and invariance properties simultaneously. In terms of\u0000generalizability, we develop the theoretical guarantee for the worst-case\u0000regret over a super-population of the meta data, which incorporates the\u0000observed sub-populations, their mixtures, as well as other unseen\u0000sub-populations that could be approximated by the observed ones. 
We demonstrate\u0000the effectiveness of our method through extensive simulation studies and an\u0000application to kidney transplantation data from hundreds of transplant centers.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"152 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140889725","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian Inference for Estimating Heat Sources through Temperature Assimilation","authors":"Hanieh Mousavi, Jeff D. Eldredge","doi":"arxiv-2405.02319","DOIUrl":"https://doi.org/arxiv-2405.02319","url":null,"abstract":"This paper introduces a Bayesian inference framework for two-dimensional\u0000steady-state heat conduction, focusing on the estimation of unknown distributed\u0000heat sources in a thermally-conducting medium with uniform conductivity. The\u0000goal is to infer heater locations, strengths, and shapes using temperature\u0000assimilation in the Euclidean space, employing a Fourier series to represent\u0000each heater's shape. The Markov Chain Monte Carlo (MCMC) method, incorporating\u0000the random-walk Metropolis-Hasting algorithm and parallel tempering, is\u0000utilized for posterior distribution exploration in both unbounded and\u0000wall-bounded domains. Strong correlations between heat strength and heater area\u0000prompt caution against simultaneously estimating these two quantities. It is\u0000found that multiple solutions arise in cases where the number of temperature\u0000sensors is less than the number of unknown states. Moreover, smaller heaters\u0000introduce greater uncertainty in estimated strength. The diffusive nature of\u0000heat conduction smooths out any deformations in the temperature contours,\u0000especially in the presence of multiple heaters positioned near each other,\u0000impacting convergence. 
In wall-bounded domains with Neumann boundary\u0000conditions, the inference of heater parameters tends to be more accurate than\u0000in unbounded domains.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"44 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140889070","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}