arXiv - STAT - Machine Learning: Latest Papers, Page 8

Federated $\mathcal{X}$-armed Bandit with Flexible Personalisation

arXiv - STAT - Machine Learning Pub Date : 2024-09-11 DOI: arxiv-2409.07251

Ali Arabzadeh, James A. Grant, David S. Leslie

Citations: 0

Is merging worth it? Securely evaluating the information gain for causal dataset acquisition

arXiv - STAT - Machine Learning Pub Date : 2024-09-11 DOI: arxiv-2409.07215

Jake Fawkes, Lucile Ter-Minassian, Desi Ivanova, Uri Shalit, Chris Holmes

Abstract: Merging datasets across institutions is a lengthy and costly procedure, especially when it involves private information. Data hosts may therefore want to prospectively gauge which datasets are most beneficial to merge with, without revealing sensitive information. For causal estimation this is particularly challenging as the value of a merge will depend not only on the reduction in epistemic uncertainty but also the improvement in overlap. To address this challenge, we introduce the first cryptographically secure information-theoretic approach for quantifying the value of a merge in the context of heterogeneous treatment effect estimation. We do this by evaluating the Expected Information Gain (EIG) and utilising multi-party computation to ensure it can be securely computed without revealing any raw data. As we demonstrate, this can be used with differential privacy (DP) to ensure privacy requirements whilst preserving more accurate computation than naive DP alone. To the best of our knowledge, this work presents the first privacy-preserving method for dataset acquisition tailored to causal estimation. We demonstrate the effectiveness and reliability of our method on a range of simulated and realistic benchmarks. The code is available anonymously.

Citations: 0

Weather-Informed Probabilistic Forecasting and Scenario Generation in Power Systems

arXiv - STAT - Machine Learning Pub Date : 2024-09-11 DOI: arxiv-2409.07637

Hanyu Zhang, Reza Zandehshahvar, Mathieu Tanneau, Pascal Van Hentenryck

Abstract: The integration of renewable energy sources (RES) into power grids presents significant challenges due to their intrinsic stochasticity and uncertainty, necessitating the development of new techniques for reliable and efficient forecasting. This paper proposes a method combining probabilistic forecasting and Gaussian copula for day-ahead prediction and scenario generation of load, wind, and solar power in high-dimensional contexts. By incorporating weather covariates and restoring spatio-temporal correlations, the proposed method enhances the reliability of probabilistic forecasts in RES. Extensive numerical experiments compare the effectiveness of different time series models, with performance evaluated using comprehensive metrics on a real-world and high-dimensional dataset from Midcontinent Independent System Operator (MISO). The results highlight the importance of weather information and demonstrate the efficacy of the Gaussian copula in generating realistic scenarios, with the proposed weather-informed Temporal Fusion Transformer (WI-TFT) model showing superior performance.
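
Illustrative note: the abstract above pairs per-variable probabilistic forecasts with a Gaussian copula to produce joint scenarios. The following is only a minimal sketch of that general idea, not the authors' implementation. It assumes each variable's forecast is given as a few quantiles, that a correlation matrix is already available, and that a piecewise-linear inverse CDF clipped at the outermost quantiles is acceptable. The names `cholesky`, `inverse_cdf`, and `sample_scenarios` are hypothetical.

```python
import random
from statistics import NormalDist


def cholesky(corr):
    """Lower-triangular Cholesky factor of a small positive-definite matrix."""
    n = len(corr)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = (corr[i][i] - s) ** 0.5
            else:
                low[i][j] = (corr[i][j] - s) / low[j][j]
    return low


def inverse_cdf(levels, values, u):
    """Piecewise-linear inverse CDF through forecast quantiles.

    Assumption: `levels` is ascending; values outside the outermost
    levels are clipped to the first or last quantile.
    """
    if u <= levels[0]:
        return values[0]
    for i in range(len(levels) - 1):
        p0, p1 = levels[i], levels[i + 1]
        if u <= p1:
            q0, q1 = values[i], values[i + 1]
            return q0 + (q1 - q0) * (u - p0) / (p1 - p0)
    return values[-1]


def sample_scenarios(levels, quantiles, corr, n_scenarios, seed=0):
    """Draw joint scenarios whose dependence follows a Gaussian copula."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    low = cholesky(corr)
    dim = len(quantiles)
    scenarios = []
    for _ in range(n_scenarios):
        # Independent standard normals, then correlate them via the factor.
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        g = [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(dim)]
        # Map each correlated normal to a uniform, then to the forecast marginal.
        u = [std_normal.cdf(x) for x in g]
        scenarios.append([inverse_cdf(levels, quantiles[i], u[i]) for i in range(dim)])
    return scenarios


if __name__ == "__main__":
    levels = [0.1, 0.5, 0.9]
    quantiles = [[10.0, 20.0, 30.0], [5.0, 8.0, 12.0]]  # made-up forecasts
    corr = [[1.0, 0.8], [0.8, 1.0]]
    for row in sample_scenarios(levels, quantiles, corr, 3):
        print(row)
```

The marginals come entirely from the forecast quantiles, while the correlation matrix only controls how the variables move together, which is the separation a copula is meant to provide.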

Citations: 0

A Scalable Algorithm for Active Learning

arXiv - STAT - Machine Learning Pub Date : 2024-09-11 DOI: arxiv-2409.07392

Youguang Chen, Zheyu Wen, George Biros

Citations: 0

Robust semi-parametric signal detection in particle physics with classifiers decorrelated via optimal transport

arXiv - STAT - Machine Learning Pub Date : 2024-09-10 DOI: arxiv-2409.06399

Purvasha Chakravarti, Lucas Kania, Olaf Behnke, Mikael Kuusela, Larry Wasserman

Abstract: Searches of new signals in particle physics are usually done by training a supervised classifier to separate a signal model from the known Standard Model physics (also called the background model). However, even when the signal model is correct, systematic errors in the background model can influence supervised classifiers and might adversely affect the signal detection procedure. To tackle this problem, one approach is to use the (possibly misspecified) classifier only to perform a preliminary signal-enrichment step and then to carry out a bump hunt on the signal-rich sample using only the real experimental data. For this procedure to work, we need a classifier constrained to be decorrelated with one or more protected variables used for the signal detection step. We do this by considering an optimal transport map of the classifier output that makes it independent of the protected variable(s) for the background. We then fit a semi-parametric mixture model to the distribution of the protected variable after making cuts on the transformed classifier to detect the presence of a signal. We compare and contrast this decorrelation method with previous approaches, show that the decorrelation procedure is robust to moderate background misspecification, and analyse the power of the signal detection test as a function of the cut on the classifier.
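
Illustrative note: for a one-dimensional classifier score, the monotone optimal transport map onto the uniform distribution is the score's own CDF, so one simple way to picture the decorrelation step in the abstract above is a conditional CDF transform computed on background events. The sketch below is an assumption-laden simplification, not the authors' procedure: it bins a single protected variable and uses empirical CDFs within each bin. The names `fit_decorrelator`, `transform`, and `bin_edges` are hypothetical.

```python
from bisect import bisect_right


def fit_decorrelator(bg_scores, bg_protected, bin_edges):
    """Collect sorted background scores for each bin of the protected variable.

    Assumption: `bin_edges` is an ascending list of interior edges, so
    there are len(bin_edges) + 1 bins.
    """
    per_bin = [[] for _ in range(len(bin_edges) + 1)]
    for score, value in zip(bg_scores, bg_protected):
        per_bin[bisect_right(bin_edges, value)].append(score)
    return [sorted(scores) for scores in per_bin]


def transform(score, protected, bin_edges, per_bin):
    """Replace a score by its empirical background CDF value within its bin.

    After this map the background scores are roughly uniform in every bin,
    so a cut on the transformed score keeps about the same background
    fraction regardless of the protected variable.
    """
    reference = per_bin[bisect_right(bin_edges, protected)]
    if not reference:
        raise ValueError("no background events in this bin")
    return bisect_right(reference, score) / len(reference)


if __name__ == "__main__":
    # Made-up background sample whose raw scores drift with the protected variable.
    scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
    protected = [0.2, 0.4, 0.6, 0.8, 1.2, 1.4, 1.6, 1.8]
    edges = [1.0]
    fitted = fit_decorrelator(scores, protected, edges)
    print([transform(s, m, edges, fitted) for s, m in zip(scores, protected)])
```

In practice the binning would be replaced by a smoother estimate of the conditional distribution; the binned version is used here only because it keeps the idea visible in a few lines.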

Citations: 0

Geometry of the Space of Partitioned Networks: A Unified Theoretical and Computational Framework

arXiv - STAT - Machine Learning Pub Date : 2024-09-10 DOI: arxiv-2409.06302

Stephen Y Zhang, Fangfei Lan, Youjia Zhou, Agnese Barbensi, Michael P H Stumpf, Bei Wang, Tom Needham

Abstract: Interactions and relations between objects may be pairwise or higher-order in nature, and so network-valued data are ubiquitous in the real world. The "space of networks", however, has a complex structure that cannot be adequately described using conventional statistical tools. We introduce a measure-theoretic formalism for modeling generalized network structures such as graphs, hypergraphs, or graphs whose nodes come with a partition into categorical classes. We then propose a metric that extends the Gromov-Wasserstein distance between graphs and the co-optimal transport distance between hypergraphs. We characterize the geometry of this space, thereby providing a unified theoretical treatment of generalized networks that encompasses the cases of pairwise, as well as higher-order, relations. In particular, we show that our metric is an Alexandrov space of non-negative curvature, and leverage this structure to define gradients for certain functionals commonly arising in geometric data analysis tasks. We extend our analysis to the setting where vertices have additional label information, and derive efficient computational schemes to use in practice. Equipped with these theoretical and computational tools, we demonstrate the utility of our framework in a suite of applications, including hypergraph alignment, clustering and dictionary learning from ensemble data, multi-omics alignment, as well as multiscale network alignment.

Citations: 0

Learning Deep Kernels for Non-Parametric Independence Testing

arXiv - STAT - Machine Learning Pub Date : 2024-09-10 DOI: arxiv-2409.06890

Nathaniel Xu, Feng Liu, Danica J. Sutherland

Citations: 0

Limit Order Book Simulation and Trade Evaluation with $K$-Nearest-Neighbor Resampling

arXiv - STAT - Machine Learning Pub Date : 2024-09-10 DOI: arxiv-2409.06514

Michael Giegrich, Roel Oomen, Christoph Reisinger

Abstract: In this paper, we show how $K$-nearest neighbor ($K$-NN) resampling, an off-policy evaluation method proposed in \cite{giegrich2023k}, can be applied to simulate limit order book (LOB) markets and how it can be used to evaluate and calibrate trading strategies. Using historical LOB data, we demonstrate that our simulation method is capable of recreating realistic LOB dynamics and that synthetic trading within the simulation leads to a market impact in line with the corresponding literature. Compared to other statistical LOB simulation methods, our algorithm has theoretical convergence guarantees under general conditions, does not require optimization, is easy to implement and computationally efficient. Furthermore, we show that in a benchmark comparison our method outperforms a deep learning-based algorithm for several key statistics. In the context of a LOB with pro-rata type matching, we demonstrate how our algorithm can calibrate the size of limit orders for a liquidation strategy. Finally, we describe how $K$-NN resampling can be modified for choices of higher dimensional state spaces.
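
Illustrative note: the core resampling idea named in the abstract above can be pictured as follows: at each simulated step, look up the $K$ historical states closest to the current one and continue with the transition observed after a randomly chosen neighbour. This is a heavily simplified sketch under stated assumptions, not the authors' algorithm. It ignores actions, rewards, and order-book specifics, uses plain Euclidean distance, and the names `knn_resample_step` and `simulate` are hypothetical.

```python
import random


def knn_resample_step(state, history, k, rng):
    """Pick the next state from one of the k nearest historical transitions.

    Assumption: `history` is a list of (state, next_state) pairs, with each
    state a tuple of floats of equal length.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    neighbours = sorted(history, key=lambda pair: sq_dist(pair[0], state))[:k]
    return rng.choice(neighbours)[1]


def simulate(start, history, k, n_steps, seed=0):
    """Roll out a synthetic path by repeated nearest-neighbour resampling."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(n_steps):
        path.append(knn_resample_step(path[-1], history, k, rng))
    return path


if __name__ == "__main__":
    # Made-up one-dimensional history that cycles 0 -> 1 -> 2 -> 0.
    history = [((0.0,), (1.0,)), ((1.0,), (2.0,)), ((2.0,), (0.0,))]
    print(simulate((0.0,), history, k=1, n_steps=3))
```

No model is fitted at any point: the simulator only searches and samples from recorded transitions, which is consistent with the abstract's remark that the approach does not require optimization.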

Citations: 0

Advancing Causal Inference: A Nonparametric Approach to ATE and CATE Estimation with Continuous Treatments

arXiv - STAT - Machine Learning Pub Date : 2024-09-10 DOI: arxiv-2409.06593

Hugo Gobato Souto, Francisco Louzada Neto

Citations: 0

Functionally Constrained Algorithm Solves Convex Simple Bilevel Problems

arXiv - STAT - Machine Learning Pub Date : 2024-09-10 DOI: arxiv-2409.06530

Huaqing Zhang, Lesi Chen, Jing Xu, Jingzhao Zhang

Citations: 0