{"title":"Exact exponential tail estimation for sums of independent centered random variables, under natural norming, with applications to the theory of U-statistics","authors":"M. R. Formica, E. Ostrovsky, L. Sirota","doi":"arxiv-2409.05083","DOIUrl":"https://doi.org/arxiv-2409.05083","url":null,"abstract":"We derive in this short report the exact exponentially decreasing tail of the distribution for naturally normed sums of independent centered random variables (r.v.), applying the theory of Grand Lebesgue Spaces (GLS). We also consider some applications to the theory of U-statistics, where we deduce, as for independent variables, refined exponential tail estimates under the natural norming sequence.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"75 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142192987","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Precise Asymptotics for Linear Mixed Models with Crossed Random Effects","authors":"Jiming Jiang, Matt P. Wand, Swarnadip Ghosh","doi":"arxiv-2409.05066","DOIUrl":"https://doi.org/arxiv-2409.05066","url":null,"abstract":"We obtain an asymptotic normality result that reveals the precise asymptotic behavior of the maximum likelihood estimators of parameters for a very general class of linear mixed models containing cross random effects. In achieving the result, we overcome theoretical difficulties that arise from random effects being crossed as opposed to the simpler nested random effects case. Our new theory is for a class of Gaussian response linear mixed models which includes crossed random slopes that partner arbitrary multivariate predictor effects and does not require the cell counts to be balanced. Statistical utilities include confidence interval construction, Wald hypothesis test and sample size calculations.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"50 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142192986","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Privacy enhanced collaborative inference in the Cox proportional hazards model for distributed data","authors":"Mengtong Hu, Xu Shi, Peter X.-K. Song","doi":"arxiv-2409.04716","DOIUrl":"https://doi.org/arxiv-2409.04716","url":null,"abstract":"Data sharing barriers are paramount challenges arising from multicenter clinical studies where multiple data sources are stored in a distributed fashion at different local study sites. Particularly in the case of time-to-event analysis when global risk sets are needed for the Cox proportional hazards model, access to a centralized database is typically necessary. Merging such data sources into a common data storage for a centralized statistical analysis requires a data use agreement, which is often time-consuming. Furthermore, the construction and distribution of risk sets to participating clinical centers for subsequent calculations may pose a risk of revealing individual-level information. We propose a new collaborative Cox model that eliminates the need for accessing the centralized database and constructing global risk sets but needs only the sharing of summary statistics with significantly smaller dimensions than risk sets. Thus, the proposed collaborative inference enjoys maximal protection of data privacy. We show theoretically and numerically that the new distributed proportional hazards model approach has little loss of statistical power when compared to the centralized method that requires merging the entire data. We present a renewable sieve method to establish large-sample properties for the proposed method. We illustrate its performance through simulation experiments and a real-world data example from patients with kidney transplantation in the Organ Procurement and Transplantation Network (OPTN) to understand the factors associated with the 5-year death-censored graft failure (DCGF) for patients who underwent kidney transplants in the US.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"33 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142192991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Generative Modelling via Quantile Regression","authors":"Johannes Schmidt-Hieber, Petr Zamolodtchikov","doi":"arxiv-2409.04231","DOIUrl":"https://doi.org/arxiv-2409.04231","url":null,"abstract":"We link conditional generative modelling to quantile regression. We propose a suitable loss function and derive minimax convergence rates for the associated risk under smoothness assumptions imposed on the conditional distribution. To establish the lower bound, we show that nonparametric regression can be seen as a sub-problem of the considered generative modelling framework. Finally, we discuss extensions of our work to generate data from multivariate distributions.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"82 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142192989","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improved Catoni-Type Confidence Sequences for Estimating the Mean When the Variance Is Infinite","authors":"Chengfu Wei, Jordan Stoyanov, Yiming Chen, Zijun Chen","doi":"arxiv-2409.04198","DOIUrl":"https://doi.org/arxiv-2409.04198","url":null,"abstract":"We consider a discrete time stochastic model with infinite variance and study the mean estimation problem as in Wang and Ramdas (2023). We refine the Catoni-type confidence sequence (abbr. CS) and use an idea of Bhatt et al. (2022) to achieve notable improvements of some currently existing results for such model. Specifically, for given $\\alpha \\in (0, 1]$, we assume that there is a known upper bound $\\nu_{\\alpha} > 0$ for the $(1 + \\alpha)$-th central moment of the population distribution that the sample follows. Our findings replicate and 'optimize' results in the above references for $\\alpha = 1$ (i.e., in models with finite variance) and enhance the results for $\\alpha < 1$. Furthermore, by employing the stitching method, we derive an upper bound on the width of the CS as $\\mathcal{O}\\left(((\\log \\log t)/t)^{\\frac{\\alpha}{1+\\alpha}}\\right)$ for the shrinking rate as $t$ increases, and $\\mathcal{O}(\\left(\\log (1/\\delta)\\right)^{\\frac{\\alpha}{1+\\alpha}})$ for the growth rate as $\\delta$ decreases. These bounds are improving upon the bounds found in Wang and Ramdas (2023). Our theoretical results are illustrated by results from a series of simulation experiments. Comparing the performance of our improved $\\alpha$-Catoni-type CS with the bound in the above cited paper indicates that our CS achieves tighter width.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"15 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142192992","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Random effects estimation in a fractional diffusion model based on continuous observations","authors":"Nesrine Chebli, Hamdi Fathallah, Yousri Slaoui","doi":"arxiv-2409.04331","DOIUrl":"https://doi.org/arxiv-2409.04331","url":null,"abstract":"The purpose of the present work is to construct estimators for the random effects in a fractional diffusion model using a hybrid estimation method where we combine parametric and nonparametric techniques. We precisely consider $n$ stochastic processes $\\left\\{X_t^j, 0 \\leq t \\leq T\\right\\}$, $j=1,\\ldots, n$ continuously observed over the time interval $[0,T]$, where the dynamics of each process are described by fractional stochastic differential equations with drifts depending on random effects. We first construct a parametric estimator for the random effects using the techniques of maximum likelihood estimation and we study its asymptotic properties when the time horizon $T$ is sufficiently large. Then by taking into account the obtained estimator for the random effects, we build a nonparametric estimator for their common unknown density function using Bernstein polynomials approximation. Some asymptotic properties of the density estimator, such as its asymptotic bias, variance and mean integrated squared error, are studied for an infinite time horizon $T$ and a fixed sample size $n$. The asymptotic normality and the uniform convergence of the estimator are investigated for an infinite time horizon $T$, a high frequency and as the order of Bernstein polynomials is sufficiently large. Some numerical simulations are also presented to illustrate the performance of the Bernstein polynomials based estimator compared to standard Kernel estimator for the random effects density function.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"140 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142192988","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Estimation of Proportion of Null Hypotheses Under Dependence","authors":"Nabaneet Das","doi":"arxiv-2409.04100","DOIUrl":"https://doi.org/arxiv-2409.04100","url":null,"abstract":"Estimation of the proportion of null hypotheses in a multiple testing problem can greatly enhance the performance of the existing algorithms. Although various estimators for the proportion of null hypotheses have been proposed, most are designed for independent samples, and their effectiveness in dependent scenarios is not well explored. This article investigates the asymptotic behavior of the BH estimator and evaluates its performance across different types of dependence. Additionally, we assess Storey's estimator and another estimator proposed by Patra and Sen (2016) to understand their effectiveness in these settings.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"24 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142192990","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimal Fidelity Estimation from Binary Measurements for Discrete and Continuous Variable Systems","authors":"Omar Fawzi, Aadil Oufkir, Robert Salzmann","doi":"arxiv-2409.04189","DOIUrl":"https://doi.org/arxiv-2409.04189","url":null,"abstract":"Estimating the fidelity between a desired target quantum state and an actual prepared state is essential for assessing the success of experiments. For pure target states, we use functional representations that can be measured directly and determine the number of copies of the prepared state needed for fidelity estimation. In continuous variable (CV) systems, we utilise the Wigner function, which can be measured via displaced parity measurements. We provide upper and lower bounds on the sample complexity required for fidelity estimation, considering the worst-case scenario across all possible prepared states. For target states of particular interest, such as Fock and Gaussian states, we find that this sample complexity is characterised by the $L^1$-norm of the Wigner function, a measure of Wigner negativity widely studied in the literature, in particular in resource theories of quantum computation. For discrete variable systems consisting of $n$ qubits, we explore fidelity estimation protocols using Pauli string measurements. Similarly to the CV approach, the sample complexity is shown to be characterised by the $L^1$-norm of the characteristic function of the target state for both Haar random states and stabiliser states. Furthermore, in a general black box model, we prove that, for any target state, the optimal sample complexity for fidelity estimation is characterised by the smoothed $L^1$-norm of the target state. To the best of our knowledge, this is the first time the $L^1$-norm of the Wigner function provides a lower bound on the cost of some information processing task.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"58 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142192994","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Estimation of service value parameters for a queue with unobserved balking","authors":"Daniel Podorojnyi, Liron Ravner","doi":"arxiv-2409.04090","DOIUrl":"https://doi.org/arxiv-2409.04090","url":null,"abstract":"In Naor's model [16], customers decide whether or not to join a queue after observing its length. We suppose that customers are heterogeneous in their service value (reward) $R$ from completed service and homogeneous in the cost of staying in the system per unit of time. It is assumed that the values of customers are independent random variables generated from a common parametric distribution. The manager observes the queue length process, but not the balking customers. Based on the queue length data, an MLE is constructed for the underlying parameters of $R$. We provide verifiable conditions for which the estimator is consistent and asymptotically normal. A dynamic pricing scheme is constructed that starts from some arbitrary price and iteratively updates the price using the estimated parameters. The performance of the estimator and the pricing algorithm are studied through a series of simulation experiments.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"75 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142192993","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Approximate D-optimal design and equilibrium measure","authors":"Didier Henrion (LAAS-POP), Jean Bernard Lasserre (LAAS-POP, TSE-R)","doi":"arxiv-2409.04058","DOIUrl":"https://doi.org/arxiv-2409.04058","url":null,"abstract":"We introduce a variant of the D-optimal design of experiments problem with a more general information matrix that takes into account the representation of the design space S. The main motivation is that if $S \\subset \\mathbb{R}^d$ is the unit ball, the unit box or the canonical simplex, then remarkably, for every dimension d and every degree n, the equilibrium measure of S (in pluripotential theory) is an optimal solution. Equivalently, for each degree n, the unique optimal solution is the vector of moments (up to degree 2n) of the equilibrium measure of S. Hence finding an optimal design reduces to finding a cubature for the equilibrium measure, with atoms in S, positive weights, and exact up to degree 2n. In addition, any resulting sequence of atomic D-optimal measures converges to the equilibrium measure of S for the weak-star topology, as n increases. Links with Fekete sets of points are also discussed. More general compact basic semialgebraic sets are also considered, and a previously developed two-step design algorithm is easily adapted to this new variant of D-optimal design problem.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"396 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142192995","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}