{"title":"The Sample Complexity of Smooth Boosting and the Tightness of the Hardcore Theorem","authors":"Guy Blanc, Alexandre Hayderi, Caleb Koch, Li-Yang Tan","doi":"arxiv-2409.11597","DOIUrl":"https://doi.org/arxiv-2409.11597","url":null,"abstract":"Smooth boosters generate distributions that do not place too much weight on any given example. Originally introduced for their noise-tolerant properties, such boosters have also found applications in differential privacy, reproducibility, and quantum learning theory. We study and settle the sample complexity of smooth boosting: we exhibit a class that can be weakly learned to $\gamma$-advantage over smooth distributions with $m$ samples, for which strong learning over the uniform distribution requires $\tilde{\Omega}(1/\gamma^2) \cdot m$ samples. This matches the overhead of existing smooth boosters and provides the first separation from the setting of distribution-independent boosting, for which the corresponding overhead is $O(1/\gamma)$. Our work also sheds new light on Impagliazzo's hardcore theorem from complexity theory, all known proofs of which can be cast in the framework of smooth boosting. For a function $f$ that is mildly hard against size-$s$ circuits, the hardcore theorem provides a set of inputs on which $f$ is extremely hard against size-$s'$ circuits. A downside of this important result is the loss in circuit size, i.e. that $s' \ll s$. Answering a question of Trevisan, we show that this size loss is necessary and, in fact, the parameters achieved by known proofs are the best possible.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"74 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261678","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
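The smoothness condition in the abstract above caps how much probability mass a booster may place on any single example. A minimal numpy sketch of a capped reweighting step is below; the exponential update rule, the function name, and the parameters are illustrative assumptions, not the paper's construction.

```python
import numpy as np

def smooth_weights(margins, eta=0.5, smoothness=2.0):
    """Exponential weights clipped so that no example receives more than
    smoothness/n probability mass (a 'smooth' distribution)."""
    n = len(margins)
    w = np.exp(-eta * np.asarray(margins, dtype=float))
    w /= w.sum()
    cap = smoothness / n
    # Clip and renormalize; renormalizing can push weights back over the
    # cap, so repeat until the cap holds everywhere.
    for _ in range(100):
        w = np.minimum(w, cap)
        w /= w.sum()
        if w.max() <= cap + 1e-12:
            break
    return w

# Examples with large positive margin (classified correctly with room to
# spare) are down-weighted; hard examples gain weight, up to the cap.
d = smooth_weights([2.0, 1.0, 0.0, -1.0], eta=0.5, smoothness=2.0)
```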
{"title":"Fractional Naive Bayes (FNB): non-convex optimization for a parsimonious weighted selective naive Bayes classifier","authors":"Carine Hue, Marc Boullé","doi":"arxiv-2409.11100","DOIUrl":"https://doi.org/arxiv-2409.11100","url":null,"abstract":"We study supervised classification for datasets with a very large number of input variables. The naïve Bayes classifier is attractive for its simplicity, scalability and effectiveness in many real data applications. When the strong naïve Bayes assumption of conditional independence of the input variables given the target variable is not valid, variable selection and model averaging are two common ways to improve the performance. In the case of the naïve Bayes classifier, the resulting weighting scheme on the models reduces to a weighting scheme on the variables. Here we focus on direct estimation of variable weights in such a weighted naïve Bayes classifier. We propose a sparse regularization of the model log-likelihood, which takes into account prior penalization costs related to each input variable. Compared to the averaging-based classifiers used up until now, our main goal is to obtain parsimonious robust models with fewer variables and equivalent performance. The direct estimation of the variable weights amounts to a non-convex optimization problem for which we propose and compare several two-stage algorithms. First, the criterion obtained by convex relaxation is minimized using several variants of standard gradient methods. Then, the initial non-convex optimization problem is solved using local optimization methods initialized with the result of the first stage. The various proposed algorithms result in optimization-based weighted naïve Bayes classifiers, which are evaluated on benchmark datasets and positioned w.r.t. a reference averaging-based classifier.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"94 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261684","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
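In a weighted naïve Bayes classifier of the kind described above, each variable's log-likelihood contribution is multiplied by a weight, so a zero weight prunes the variable entirely. A hedged numpy sketch with Gaussian class conditionals follows; the function name and toy parameters are invented for illustration and are not the FNB model itself.

```python
import numpy as np

def weighted_nb_log_scores(x, priors, means, stds, weights):
    """Class log-scores for a weighted naive Bayes model:
    log P(y) + sum_j w_j * log N(x_j; mean_{y,j}, std_{y,j})."""
    log_lik = (-0.5 * ((x - means) / stds) ** 2
               - np.log(stds * np.sqrt(2 * np.pi)))   # shape (classes, vars)
    return np.log(priors) + (weights * log_lik).sum(axis=1)

# Two classes, three variables; the sparse weight vector drops variable 3.
priors = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 5.0]])
stds = np.ones((2, 3))
weights = np.array([1.0, 0.5, 0.0])   # parsimonious: third variable pruned
scores = weighted_nb_log_scores(np.array([0.9, 1.1, 0.0]), priors, means, stds, weights)
pred = int(np.argmax(scores))
```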
{"title":"Partially Observable Contextual Bandits with Linear Payoffs","authors":"Sihan Zeng, Sujay Bhatt, Alec Koppel, Sumitra Ganesh","doi":"arxiv-2409.11521","DOIUrl":"https://doi.org/arxiv-2409.11521","url":null,"abstract":"The standard contextual bandit framework assumes fully observable and actionable contexts. In this work, we consider a new bandit setting with partially observable, correlated contexts and linear payoffs, motivated by applications in finance where decision making is based on market information that typically displays temporal correlation and is not fully observed. We make the following contributions, marrying ideas from statistical signal processing with bandits: (i) We propose an algorithmic pipeline named EMKF-Bandit, which integrates system identification, filtering, and classic contextual bandit algorithms into an iterative method alternating between latent parameter estimation and decision making. (ii) We analyze EMKF-Bandit when Thompson sampling is selected as the bandit algorithm and show that it incurs sub-linear regret under conditions on the filtering. (iii) We conduct numerical simulations that demonstrate the benefits and practical applicability of the proposed pipeline.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"53 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261677","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
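One way to picture the filtering-plus-bandit alternation described above is a Kalman update on the latent context followed by Thompson sampling on the linear payoff. This is a generic sketch under standard linear-Gaussian assumptions, not the EMKF-Bandit algorithm itself; all names, matrices, and parameters here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def kalman_update(mu, P, y, A, C, Q, R):
    """One predict/update step of a Kalman filter tracking the latent context
    from a noisy, partial observation y."""
    mu_pred = A @ mu
    P_pred = A @ P @ A.T + Q
    S = C @ P_pred @ C.T + R                    # innovation covariance
    K = P_pred @ C.T @ np.linalg.inv(S)         # Kalman gain
    mu_new = mu_pred + K @ (y - C @ mu_pred)
    P_new = (np.eye(len(mu)) - K @ C) @ P_pred
    return mu_new, P_new

def thompson_action(ctx_est, theta_mean, theta_cov, n_actions):
    """Sample one payoff parameter per action from its posterior and pick the
    argmax of the sampled linear payoff <theta_a, filtered context>."""
    samples = rng.multivariate_normal(theta_mean, theta_cov, size=n_actions)
    return int(np.argmax(samples @ ctx_est))

# Filter the latent context from an observation, then act on the estimate.
mu, P = kalman_update(np.zeros(2), np.eye(2), np.array([1.0, 0.0]),
                      np.eye(2), np.eye(2), 0.1 * np.eye(2), 0.1 * np.eye(2))
a = thompson_action(mu, np.zeros(2), np.eye(2), n_actions=3)
```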
{"title":"Kolmogorov-Arnold Networks in Low-Data Regimes: A Comparative Study with Multilayer Perceptrons","authors":"Farhad Pourkamali-Anaraki","doi":"arxiv-2409.10463","DOIUrl":"https://doi.org/arxiv-2409.10463","url":null,"abstract":"Multilayer Perceptrons (MLPs) have long been a cornerstone in deep learning, known for their capacity to model complex relationships. Recently, Kolmogorov-Arnold Networks (KANs) have emerged as a compelling alternative, utilizing highly flexible learnable activation functions directly on network edges, a departure from the neuron-centric approach of MLPs. However, KANs significantly increase the number of learnable parameters, raising concerns about their effectiveness in data-scarce environments. This paper presents a comprehensive comparative study of MLPs and KANs from both algorithmic and experimental perspectives, with a focus on low-data regimes. We introduce an effective technique for designing MLPs with unique, parameterized activation functions for each neuron, enabling a more balanced comparison with KANs. Using empirical evaluations on simulated data and two real-world data sets from medicine and engineering, we explore the trade-offs between model complexity and accuracy, with particular attention to the role of network depth. Our findings show that MLPs with individualized activation functions achieve significantly higher predictive accuracy with only a modest increase in parameters, especially when the sample size is limited to around one hundred. For example, in a three-class classification problem within additive manufacturing, MLPs achieve a median accuracy of 0.91, significantly outperforming KANs, which only reach a median accuracy of 0.53 with default hyperparameters. These results offer valuable insights into the impact of activation function selection in neural networks.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261691","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
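The "individualized activation function per neuron" idea above can be illustrated with a PReLU-style learnable slope per hidden unit. This is one simple parameterization chosen for illustration; the paper's actual activation family may differ.

```python
import numpy as np

def prelu(z, alpha):
    """PReLU with a separate learnable slope per neuron: z if z > 0,
    else alpha * z, where alpha has one entry per neuron."""
    return np.where(z > 0, z, alpha * z)

# Forward pass of a hidden layer of 3 neurons, each carrying its own
# activation parameter (toy weights; alpha would be learned by gradient descent).
W = np.array([[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]])
alpha = np.array([0.1, 0.2, 0.3])   # one slope per neuron
h = prelu(W @ np.array([1.0, 2.0]), alpha)
```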
{"title":"Multidimensional Deconvolution with Profiling","authors":"Huanbiao Zhu, Krish Desai, Mikael Kuusela, Vinicius Mikuni, Benjamin Nachman, Larry Wasserman","doi":"arxiv-2409.10421","DOIUrl":"https://doi.org/arxiv-2409.10421","url":null,"abstract":"In many experimental contexts, it is necessary to statistically remove the impact of instrumental effects in order to physically interpret measurements. This task has been extensively studied in particle physics, where the deconvolution task is called unfolding. A number of recent methods have shown how to perform high-dimensional, unbinned unfolding using machine learning. However, one of the assumptions in all of these methods is that the detector response is accurately modeled in the Monte Carlo simulation. In practice, the detector response depends on a number of nuisance parameters that can be constrained with data. We propose a new algorithm called Profile OmniFold (POF), which works in an iterative manner similar to the OmniFold (OF) algorithm while being able to simultaneously profile the nuisance parameters. We illustrate the method with a Gaussian example as a proof of concept, highlighting its promising capabilities.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"18 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261692","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Variance-reduced first-order methods for deterministically constrained stochastic nonconvex optimization with strong convergence guarantees","authors":"Zhaosong Lu, Sanyou Mei, Yifeng Xiao","doi":"arxiv-2409.09906","DOIUrl":"https://doi.org/arxiv-2409.09906","url":null,"abstract":"In this paper, we study a class of deterministically constrained stochastic optimization problems. Existing methods typically aim to find an $\epsilon$-stochastic stationary point, where the expected violations of both the constraints and first-order stationarity are within a prescribed accuracy of $\epsilon$. However, in many practical applications, it is crucial that the constraints be nearly satisfied with certainty, making such an $\epsilon$-stochastic stationary point potentially undesirable due to the risk of significant constraint violations. To address this issue, we propose single-loop variance-reduced stochastic first-order methods, where the stochastic gradient of the stochastic component is computed using either a truncated recursive momentum scheme or a truncated Polyak momentum scheme for variance reduction, while the gradient of the deterministic component is computed exactly. Under the error bound condition with a parameter $\theta \geq 1$ and other suitable assumptions, we establish that the proposed methods achieve a sample complexity and first-order operation complexity of $\widetilde{O}(\epsilon^{-\max\{4, 2\theta\}})$ for finding a stronger $\epsilon$-stochastic stationary point, where the constraint violation is within $\epsilon$ with certainty, and the expected violation of first-order stationarity is within $\epsilon$. To the best of our knowledge, this is the first work to develop methods with provable complexity guarantees for finding an approximate stochastic stationary point of such problems that nearly satisfies all constraints with certainty.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261740","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
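A truncated recursive momentum estimator of the kind mentioned above can be sketched in the STORM style: combine the fresh stochastic gradient with the previous estimate's correction term, then project onto a ball. The signature, the STORM form, and truncation-by-projection are assumptions for illustration, not the paper's exact scheme.

```python
import numpy as np

def truncated_recursive_momentum(d_prev, g_curr, g_prev_at_curr_sample, beta, radius):
    """Recursive-momentum gradient estimator with truncation:
    d_t = g(x_t) + (1 - beta) * (d_{t-1} - g(x_{t-1})), where both gradients
    are evaluated on the same fresh sample, then d_t is projected onto a
    ball of the given radius (the 'truncation')."""
    d = g_curr + (1.0 - beta) * (d_prev - g_prev_at_curr_sample)
    norm = np.linalg.norm(d)
    if norm > radius:
        d = d * (radius / norm)   # project onto the radius-ball
    return d

# One update step; the raw estimate has norm 7.5 and is truncated to norm 1.
d_example = truncated_recursive_momentum(np.array([3.0, 4.0]), np.array([3.0, 4.0]),
                                         np.zeros(2), beta=0.5, radius=1.0)
```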
{"title":"Robust Reinforcement Learning with Dynamic Distortion Risk Measures","authors":"Anthony Coache, Sebastian Jaimungal","doi":"arxiv-2409.10096","DOIUrl":"https://doi.org/arxiv-2409.10096","url":null,"abstract":"In a reinforcement learning (RL) setting, the agent's optimal strategy heavily depends on her risk preferences and the underlying model dynamics of the training environment. These two aspects influence the agent's ability to make well-informed and time-consistent decisions when facing testing environments. In this work, we devise a framework to solve robust risk-aware RL problems where we simultaneously account for environmental uncertainty and risk with a class of dynamic robust distortion risk measures. Robustness is introduced by considering all models within a Wasserstein ball around a reference model. We estimate such dynamic robust risk measures using neural networks by making use of strictly consistent scoring functions, derive policy gradient formulae using the quantile representation of distortion risk measures, and construct an actor-critic algorithm to solve this class of robust risk-aware RL problems. We demonstrate the performance of our algorithm on a portfolio allocation example.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"47 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261738","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
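The quantile representation mentioned in the abstract above writes a distortion risk measure as an integral of the loss quantile function against a distortion of the probability scale. A minimal Monte Carlo sketch follows, using the CVaR distortion as a concrete instance; it is a static estimator for illustration, not the paper's dynamic robust measure or its neural estimation.

```python
import numpy as np

def distortion_risk(losses, g):
    """Monte Carlo estimate of a distortion risk measure via its quantile
    representation: rho = sum_i F^{-1}(u_i) * (g(i/n) - g((i-1)/n)),
    where g is the distortion function on [0, 1]."""
    x = np.sort(np.asarray(losses, dtype=float))  # empirical quantiles
    n = len(x)
    u = np.arange(n + 1) / n
    w = g(u[1:]) - g(u[:-1])                      # distorted probability weights
    return float((x * w).sum())

# CVaR at level 0.9 corresponds to the distortion g(u) = max(u - 0.9, 0) / 0.1,
# which places all weight on the worst 10% of losses.
cvar_g = lambda u: np.maximum(u - 0.9, 0.0) / 0.1
risk = distortion_risk(np.arange(1.0, 11.0), cvar_g)
```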
{"title":"Manifold-Constrained Nucleus-Level Denoising Diffusion Model for Structure-Based Drug Design","authors":"Shengchao Liu, Divin Yan, Weitao Du, Weiyang Liu, Zhuoxinran Li, Hongyu Guo, Christian Borgs, Jennifer Chayes, Anima Anandkumar","doi":"arxiv-2409.10584","DOIUrl":"https://doi.org/arxiv-2409.10584","url":null,"abstract":"Artificial intelligence models have shown great potential in structure-based drug design, generating ligands with high binding affinities. However, existing models have often overlooked a crucial physical constraint: atoms must maintain a minimum pairwise distance to avoid separation violation, a phenomenon governed by the balance of attractive and repulsive forces. To mitigate such separation violations, we propose NucleusDiff. It models the interactions between atomic nuclei and their surrounding electron clouds by enforcing the distance constraint between the nuclei and manifolds. We quantitatively evaluate NucleusDiff using the CrossDocked2020 dataset and a COVID-19 therapeutic target, demonstrating that NucleusDiff reduces violation rate by up to 100.00% and enhances binding affinity by up to 22.16%, surpassing state-of-the-art models for structure-based drug design. We also provide qualitative analysis through manifold sampling, visually confirming the effectiveness of NucleusDiff in reducing separation violations and improving binding affinities.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"27 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261687","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
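The violation rate reported above counts atom pairs that come closer than a minimum allowed distance. A small numpy sketch of that metric (the function name and threshold are illustrative; the paper's manifold-based constraint is more elaborate than this pairwise check):

```python
import numpy as np

def violation_rate(coords, d_min=1.0):
    """Fraction of atom pairs closer than the minimum allowed distance,
    i.e. the 'separation violation' rate a generator should drive to zero."""
    diff = coords[:, None, :] - coords[None, :, :]   # pairwise displacement
    dist = np.sqrt((diff ** 2).sum(-1))              # pairwise distances
    iu = np.triu_indices(len(coords), k=1)           # unique pairs only
    return float((dist[iu] < d_min).mean())

# Three atoms on a line; only the first pair (distance 0.5) violates d_min = 1.
rate = violation_rate(np.array([[0.0, 0.0, 0.0],
                                [0.5, 0.0, 0.0],
                                [3.0, 0.0, 0.0]]))
```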
{"title":"A Bayesian Interpretation of Adaptive Low-Rank Adaptation","authors":"Haolin Chen, Philip N. Garner","doi":"arxiv-2409.10673","DOIUrl":"https://doi.org/arxiv-2409.10673","url":null,"abstract":"Motivated by the sensitivity-based importance score of the adaptive low-rank adaptation (AdaLoRA), we utilize more theoretically supported metrics, including the signal-to-noise ratio (SNR), along with the Improved Variational Online Newton (IVON) optimizer, for adaptive parameter budget allocation. The resulting Bayesian counterpart not only has matched or surpassed the performance of using the sensitivity-based importance metric but is also a faster alternative to AdaLoRA with Adam. Our theoretical analysis reveals a significant connection between the two metrics, providing a Bayesian perspective on the efficacy of sensitivity as an importance score. Furthermore, our findings suggest that the magnitude, rather than the variance, is the primary indicator of the importance of parameters.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261686","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
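An SNR-style importance score of the kind referenced above can be computed from an approximate posterior over parameters as squared mean over variance; parameters with high SNR keep budget, low-SNR ones are pruned. This is a generic sketch of the SNR idea, with an assumed definition (SNR = mean² / variance), not the paper's exact allocation rule.

```python
import numpy as np

def snr_importance(mean, variance):
    """Signal-to-noise-ratio importance per parameter under an approximate
    Gaussian posterior: SNR = mean^2 / variance. High magnitude relative
    to uncertainty marks an important parameter."""
    return np.asarray(mean, dtype=float) ** 2 / np.asarray(variance, dtype=float)

# A confident large-magnitude parameter scores far above a noisy small one.
scores_example = snr_importance([2.0, 1.0], [1.0, 4.0])
```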
{"title":"Spatiotemporal Covariance Neural Networks","authors":"Andrea Cavallo, Mohammad Sabbaqi, Elvin Isufi","doi":"arxiv-2409.10068","DOIUrl":"https://doi.org/arxiv-2409.10068","url":null,"abstract":"Modeling spatiotemporal interactions in multivariate time series is key to their effective processing, but challenging because of their irregular and often unknown structure. Statistical properties of the data provide useful biases to model interdependencies and are leveraged by correlation and covariance-based networks as well as by processing pipelines relying on principal component analysis (PCA). However, PCA and its temporal extensions suffer instabilities in the covariance eigenvectors when the corresponding eigenvalues are close to each other, making their application to dynamic and streaming data settings challenging. To address these issues, we exploit the analogy between PCA and graph convolutional filters to introduce the SpatioTemporal coVariance Neural Network (STVNN), a relational learning model that operates on the sample covariance matrix of the time series and leverages joint spatiotemporal convolutions to model the data. To account for the streaming and non-stationary setting, we consider an online update of the parameters and sample covariance matrix. We prove that STVNN is stable to the uncertainties introduced by these online estimations, thus improving over temporal PCA-based methods. Experimental results corroborate our theoretical findings and show that STVNN is competitive for multivariate time series processing: it adapts to changes in the data distribution and is orders of magnitude more stable than online temporal PCA.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261739","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
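The online covariance update mentioned in the abstract above can be sketched with an exponentially weighted streaming estimator of the mean and sample covariance. This is one simple way to track covariance on streaming data, with an assumed forgetting factor and initialization; it is not the STVNN update itself.

```python
import numpy as np

class OnlineCovariance:
    """Exponentially weighted streaming estimate of the mean and the sample
    covariance matrix, suitable for non-stationary data (old samples decay)."""
    def __init__(self, dim, alpha=0.1):
        self.mu = np.zeros(dim)    # running mean (initialization assumed)
        self.cov = np.eye(dim)     # running covariance (initialization assumed)
        self.alpha = alpha         # forgetting factor

    def update(self, x):
        self.mu = (1 - self.alpha) * self.mu + self.alpha * x
        d = (x - self.mu)[:, None]
        self.cov = (1 - self.alpha) * self.cov + self.alpha * (d @ d.T)
        return self.cov

# Feed one streaming sample and read back the updated covariance estimate.
oc = OnlineCovariance(dim=2)
C = oc.update(np.array([1.0, 2.0]))
```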