{"title":"Convergence of opinions","authors":"Vladimir Vovk","doi":"arxiv-2312.02033","DOIUrl":"https://doi.org/arxiv-2312.02033","url":null,"abstract":"This paper establishes a game-theoretic version of the classical Blackwell-Dubins result. We consider two forecasters who at each step issue probability forecasts for the infinite future. Our result says that either at least one of the two forecasters will be discredited or their forecasts will converge in total variation.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"88 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138521315","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cone Ranking for Multi-Criteria Decision Making","authors":"Andreas H Hamel, Daniel Kostner","doi":"arxiv-2312.03006","DOIUrl":"https://doi.org/arxiv-2312.03006","url":null,"abstract":"Recently introduced cone distribution functions from statistics are turned into multi-criteria decision making (MCDM) tools. It is demonstrated that this procedure can be considered an upgrade of the weighted sum scalarization insofar as it absorbs a whole collection of weighted sum scalarizations at once instead of fixing a particular one in advance. Moreover, situations are characterized in which different types of rank reversal occur, and it is explained why this might even be useful for analyzing the ranking procedure. A few examples are discussed, and a potential application in machine learning is outlined.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"485 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138546744","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A unified framework for covariate adjustment under stratified randomization","authors":"Fuyi Tu, Wei Ma, Hanzhong Liu","doi":"arxiv-2312.01266","DOIUrl":"https://doi.org/arxiv-2312.01266","url":null,"abstract":"Randomization, as a key technique in clinical trials, can eliminate sources of bias and produce comparable treatment groups. In randomized experiments, the treatment effect is a parameter of general interest. Researchers have explored the validity of using linear models to estimate the treatment effect, perform covariate adjustment, and thus improve estimation efficiency. However, the relationship between covariates and outcomes is not necessarily linear and is often intricate. Advances in statistical theory and related computer technology allow us to use nonparametric and machine learning methods to better estimate the relationship between covariates and outcomes and thus obtain further efficiency gains. However, theoretical studies on how to draw valid inferences when using nonparametric and machine learning methods under stratified randomization are yet to be conducted. In this paper, we discuss a unified framework for covariate adjustment and corresponding statistical inference under stratified randomization, and present a detailed proof of the validity of using local linear kernel-weighted least squares regression for covariate adjustment in treatment effect estimators as a special case. In the case of high-dimensional data, we additionally propose an algorithm for statistical inference using machine learning methods under stratified randomization, which makes use of sample splitting to alleviate the requirements on the asymptotic properties of machine learning methods. Finally, we compare the performances of treatment effect estimators using different machine learning methods by considering various data generation scenarios, to guide practical research.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"83 2","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138521424","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the admissibility of Horvitz-Thompson estimator for estimating causal effects under network interference","authors":"Vishesh Karwa, Edoardo M. Airoldi","doi":"arxiv-2312.01234","DOIUrl":"https://doi.org/arxiv-2312.01234","url":null,"abstract":"The Horvitz-Thompson (H-T) estimator is widely used for estimating various types of average treatment effects under network interference. We systematically investigate the optimality properties of the H-T estimator under network interference by embedding it in the class of all linear estimators. In particular, we show that in the presence of any kind of network interference, the H-T estimator is inadmissible in the class of all linear estimators under both a completely randomized and a Bernoulli design. We also show that the H-T estimator becomes admissible under certain restricted randomization schemes termed ``fixed exposure designs'', and we give examples of such designs. It is well known that the H-T estimator is unbiased when correct weights are specified. Here, we derive the weights for unbiased estimation of various causal effects, and illustrate how they depend not only on the design but, more importantly, on the assumed form of interference (which in many real-world situations is unknown at the design stage) and on the causal effect of interest.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"87 5","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138521316","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bagged Regularized $k$-Distances for Anomaly Detection","authors":"Yuchao Cai, Yuheng Ma, Hanfang Yang, Hanyuan Hang","doi":"arxiv-2312.01046","DOIUrl":"https://doi.org/arxiv-2312.01046","url":null,"abstract":"We consider the paradigm of unsupervised anomaly detection, which involves the identification of anomalies within a dataset in the absence of labeled examples. Although distance-based methods are top-performing for unsupervised anomaly detection, they suffer heavily from sensitivity to the choice of the number of nearest neighbors. In this paper, we propose a new distance-based algorithm called bagged regularized $k$-distances for anomaly detection (BRDAD), which converts the unsupervised anomaly detection problem into a convex optimization problem. Our BRDAD algorithm selects the weights by minimizing the surrogate risk, i.e., the finite sample bound on the empirical risk of the bagged weighted $k$-distances for density estimation (BWDDE). This approach enables us to successfully address the sensitivity challenge of hyperparameter choice in distance-based algorithms. Moreover, when dealing with large-scale datasets, efficiency issues can be addressed by the bagging technique incorporated in our BRDAD algorithm. On the theoretical side, we establish fast convergence rates for the AUC regret of our algorithm and demonstrate that the bagging technique significantly reduces the computational complexity. On the practical side, we conduct numerical experiments on anomaly detection benchmarks to illustrate the insensitivity of our algorithm to parameter selection compared with other state-of-the-art distance-based methods. Moreover, promising improvements are brought by applying the bagging technique in our algorithm on real-world datasets.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"88 2","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138521314","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identification and Inference for Synthetic Controls with Confounding","authors":"Guido W. Imbens, Davide Viviano","doi":"arxiv-2312.00955","DOIUrl":"https://doi.org/arxiv-2312.00955","url":null,"abstract":"This paper studies inference on treatment effects in panel data settings with unobserved confounding. We model outcome variables through a factor model with random factors and loadings. Such factors and loadings may act as unobserved confounders: when the treatment is implemented depends on time-varying factors, and who receives the treatment depends on unit-level confounders. We study the identification of treatment effects and illustrate the presence of a trade-off between time and unit-level confounding. We provide asymptotic results for inference for several Synthetic Control estimators and show that different sources of randomness should be considered for inference, depending on the nature of confounding. We conclude with a comparison of Synthetic Control estimators with alternatives for factor models.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"93 2","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138521273","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Inference on common trends in functional time series","authors":"Morten Ørregaard Nielsen, Won-Ki Seo, Dakyung Seong","doi":"arxiv-2312.00590","DOIUrl":"https://doi.org/arxiv-2312.00590","url":null,"abstract":"This paper studies statistical inference on unit roots and cointegration for time series in a Hilbert space. We develop statistical inference on the number of common stochastic trends embedded in the time series, i.e., the dimension of the nonstationary subspace. We also consider hypotheses on the nonstationary subspace itself. The Hilbert space can be of an arbitrarily large dimension, and our methods remain asymptotically valid even when the time series of interest takes values in a subspace of possibly unknown dimension. This has wide applicability in practice: for example, to cointegrated vector time series of finite dimension, to high-dimensional factor models that include a finite number of nonstationary factors, to cointegrated curve-valued (or function-valued) time series, and to nonstationary dynamic functional factor models. We include two empirical illustrations, to the term structure of interest rates and to labor market indices, respectively.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"93 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138521274","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multiple Testing of Linear Forms for Noisy Matrix Completion","authors":"Wanteng Ma, Lilun Du, Dong Xia, Ming Yuan","doi":"arxiv-2312.00305","DOIUrl":"https://doi.org/arxiv-2312.00305","url":null,"abstract":"Many important tasks of large-scale recommender systems can be naturally cast as testing multiple linear forms for noisy matrix completion. These problems, however, present unique challenges because of a subtle bias-and-variance tradeoff and an intricate dependence among the estimated entries induced by the low-rank structure. In this paper, we develop a general approach to overcome these difficulties by introducing new statistics for individual tests with sharp asymptotics, both marginally and jointly, and utilizing them to control the false discovery rate (FDR) via a data splitting and symmetric aggregation scheme. We show that valid FDR control can be achieved with guaranteed power under nearly optimal sample size requirements using the proposed methodology. Extensive numerical simulations and real data examples are also presented to further illustrate its practical merits.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"88 4","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138521313","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Revisiting the two-sample location shift model with a log-concavity assumption","authors":"Ridhiman Saha, Priyam Das, Nilanjana Laha","doi":"arxiv-2311.18277","DOIUrl":"https://doi.org/arxiv-2311.18277","url":null,"abstract":"In this paper, we consider the two-sample location shift model, a classic semiparametric model introduced by Stein (1956). This model is known for its adaptive nature, enabling nonparametric estimation with full parametric efficiency. Existing nonparametric estimators of the location shift often depend on external tuning parameters, which restricts their practical applicability (Van der Vaart and Wellner, 2021). We demonstrate that introducing an additional assumption of log-concavity on the underlying density can alleviate the need for tuning parameters. We propose a one-step estimator for location shift estimation, utilizing log-concave density estimation techniques to facilitate tuning-free estimation of the efficient influence function. While we employ a truncated version of the one-step estimator for theoretical adaptivity, our simulations indicate that the one-step estimators perform best with zero truncation, eliminating the need for tuning during practical implementation.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"85 2","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138521324","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Asymptotic Efficiency for Fractional Brownian Motion with general noise","authors":"Grégoire Szymanski, Tetsuya Takabatake","doi":"arxiv-2311.18669","DOIUrl":"https://doi.org/arxiv-2311.18669","url":null,"abstract":"We investigate the Local Asymptotic Property for fractional Brownian models based on discrete observations contaminated by a Gaussian moving average process. We consider both situations of low- and high-frequency observations in a unified setup, and we show that the convergence rate $n^{1/2} (\\nu_n \\Delta_n^{-H})^{-1/(2H+2K+1)}$ is optimal for estimating the Hurst index $H$, where $\\nu_n$ is the noise intensity, $\\Delta_n$ is the sampling frequency, and $K$ is the moving average order. We also derive asymptotically efficient variances and build an estimator achieving this convergence rate and variance. This theoretical analysis is backed up by a comprehensive numerical analysis of the estimation procedure that illustrates, in particular, its effectiveness for finite samples.","PeriodicalId":501330,"journal":{"name":"arXiv - MATH - Statistics Theory","volume":"84 2","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138521328","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}