arXiv - STAT - Machine Learning: Latest Articles

Automated Discovery of Pairwise Interactions from Unstructured Data
arXiv - STAT - Machine Learning Pub Date : 2024-09-11 DOI: arxiv-2409.07594
Zuheng (David) Xu, Moksh Jain, Ali Denton, Shawn Whitfield, Aniket Didolkar, Berton Earnshaw, Jason Hartford
{"title":"Automated Discovery of Pairwise Interactions from Unstructured Data","authors":"ZuhengDavid, Xu, Moksh Jain, Ali Denton, Shawn Whitfield, Aniket Didolkar, Berton Earnshaw, Jason Hartford","doi":"arxiv-2409.07594","DOIUrl":"https://doi.org/arxiv-2409.07594","url":null,"abstract":"Pairwise interactions between perturbations to a system can provide evidence\u0000for the causal dependencies of the underlying underlying mechanisms of a\u0000system. When observations are low dimensional, hand crafted measurements,\u0000detecting interactions amounts to simple statistical tests, but it is not\u0000obvious how to detect interactions between perturbations affecting latent\u0000variables. We derive two interaction tests that are based on pairwise\u0000interventions, and show how these tests can be integrated into an active\u0000learning pipeline to efficiently discover pairwise interactions between\u0000perturbations. We illustrate the value of these tests in the context of\u0000biology, where pairwise perturbation experiments are frequently used to reveal\u0000interactions that are not observable from any single perturbation. Our tests\u0000can be run on unstructured data, such as the pixels in an image, which enables\u0000a more general notion of interaction than typical cell viability experiments,\u0000and can be run on cheaper experimental assays. We validate on several synthetic\u0000and real biological experiments that our tests are able to identify interacting\u0000pairs effectively. We evaluate our approach on a real biological experiment\u0000where we knocked out 50 pairs of genes and measured the effect with microscopy\u0000images. We show that we are able to recover significantly more known biological\u0000interactions than random search and standard active learning baselines.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"45 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142206610","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Convergence of continuous-time stochastic gradient descent with applications to linear deep neural networks
arXiv - STAT - Machine Learning Pub Date : 2024-09-11 DOI: arxiv-2409.07401
Gábor Lugosi, Eulàlia Nualart
{"title":"Convergence of continuous-time stochastic gradient descent with applications to linear deep neural networks","authors":"Gabor Lugosi, Eulalia Nualart","doi":"arxiv-2409.07401","DOIUrl":"https://doi.org/arxiv-2409.07401","url":null,"abstract":"We study a continuous-time approximation of the stochastic gradient descent\u0000process for minimizing the expected loss in learning problems. The main results\u0000establish general sufficient conditions for the convergence, extending the\u0000results of Chatterjee (2022) established for (nonstochastic) gradient descent.\u0000We show how the main result can be applied to the case of overparametrized\u0000linear neural network training.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"16 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142206640","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Exploring User-level Gradient Inversion with a Diffusion Prior
arXiv - STAT - Machine Learning Pub Date : 2024-09-11 DOI: arxiv-2409.07291
Zhuohang Li, Andrew Lowy, Jing Liu, Toshiaki Koike-Akino, Bradley Malin, Kieran Parsons, Ye Wang
{"title":"Exploring User-level Gradient Inversion with a Diffusion Prior","authors":"Zhuohang Li, Andrew Lowy, Jing Liu, Toshiaki Koike-Akino, Bradley Malin, Kieran Parsons, Ye Wang","doi":"arxiv-2409.07291","DOIUrl":"https://doi.org/arxiv-2409.07291","url":null,"abstract":"We explore user-level gradient inversion as a new attack surface in\u0000distributed learning. We first investigate existing attacks on their ability to\u0000make inferences about private information beyond training data reconstruction.\u0000Motivated by the low reconstruction quality of existing methods, we propose a\u0000novel gradient inversion attack that applies a denoising diffusion model as a\u0000strong image prior in order to enhance recovery in the large batch setting.\u0000Unlike traditional attacks, which aim to reconstruct individual samples and\u0000suffer at large batch and image sizes, our approach instead aims to recover a\u0000representative image that captures the sensitive shared semantic information\u0000corresponding to the underlying user. Our experiments with face images\u0000demonstrate the ability of our methods to recover realistic facial images along\u0000with private user attributes.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142206642","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Tuning-Free Online Robust Principal Component Analysis through Implicit Regularization
arXiv - STAT - Machine Learning Pub Date : 2024-09-11 DOI: arxiv-2409.07275
Lakshmi Jayalal, Gokularam Muthukrishnan, Sheetal Kalyani
{"title":"Tuning-Free Online Robust Principal Component Analysis through Implicit Regularization","authors":"Lakshmi Jayalal, Gokularam Muthukrishnan, Sheetal Kalyani","doi":"arxiv-2409.07275","DOIUrl":"https://doi.org/arxiv-2409.07275","url":null,"abstract":"The performance of the standard Online Robust Principal Component Analysis\u0000(OR-PCA) technique depends on the optimum tuning of the explicit regularizers\u0000and this tuning is dataset sensitive. We aim to remove the dependency on these\u0000tuning parameters by using implicit regularization. We propose to use the\u0000implicit regularization effect of various modified gradient descents to make\u0000OR-PCA tuning free. Our method incorporates three different versions of\u0000modified gradient descent that separately but naturally encourage sparsity and\u0000low-rank structures in the data. The proposed method performs comparable or\u0000better than the tuned OR-PCA for both simulated and real-world datasets.\u0000Tuning-free ORPCA makes it more scalable for large datasets since we do not\u0000require dataset-dependent parameter tuning.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"203 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142206691","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Reranking Laws for Language Generation: A Communication-Theoretic Perspective
arXiv - STAT - Machine Learning Pub Date : 2024-09-11 DOI: arxiv-2409.07131
António Farinhas, Haau-Sing Li, André F. T. Martins
{"title":"Reranking Laws for Language Generation: A Communication-Theoretic Perspective","authors":"António Farinhas, Haau-Sing Li, André F. T. Martins","doi":"arxiv-2409.07131","DOIUrl":"https://doi.org/arxiv-2409.07131","url":null,"abstract":"To ensure large language models (LLMs) are used safely, one must reduce their\u0000propensity to hallucinate or to generate unacceptable answers. A simple and\u0000often used strategy is to first let the LLM generate multiple hypotheses and\u0000then employ a reranker to choose the best one. In this paper, we draw a\u0000parallel between this strategy and the use of redundancy to decrease the error\u0000rate in noisy communication channels. We conceptualize the generator as a\u0000sender transmitting multiple descriptions of a message through parallel noisy\u0000channels. The receiver decodes the message by ranking the (potentially\u0000corrupted) descriptions and selecting the one found to be most reliable. We\u0000provide conditions under which this protocol is asymptotically error-free\u0000(i.e., yields an acceptable answer almost surely) even in scenarios where the\u0000reranker is imperfect (governed by Mallows or Zipf-Mandelbrot models) and the\u0000channel distributions are statistically dependent. We use our framework to\u0000obtain reranking laws which we validate empirically on two real-world tasks\u0000using LLMs: text-to-code generation with DeepSeek-Coder 7B and machine\u0000translation of medical data with TowerInstruct 13B.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"45 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142206643","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
From optimal score matching to optimal sampling
arXiv - STAT - Machine Learning Pub Date : 2024-09-11 DOI: arxiv-2409.07032
Zehao Dou, Subhodh Kotekal, Zhehao Xu, Harrison H. Zhou
{"title":"From optimal score matching to optimal sampling","authors":"Zehao Dou, Subhodh Kotekal, Zhehao Xu, Harrison H. Zhou","doi":"arxiv-2409.07032","DOIUrl":"https://doi.org/arxiv-2409.07032","url":null,"abstract":"The recent, impressive advances in algorithmic generation of high-fidelity\u0000image, audio, and video are largely due to great successes in score-based\u0000diffusion models. A key implementing step is score matching, that is, the\u0000estimation of the score function of the forward diffusion process from training\u0000data. As shown in earlier literature, the total variation distance between the\u0000law of a sample generated from the trained diffusion model and the ground truth\u0000distribution can be controlled by the score matching risk. Despite the widespread use of score-based diffusion models, basic theoretical\u0000questions concerning exact optimal statistical rates for score estimation and\u0000its application to density estimation remain open. We establish the sharp\u0000minimax rate of score estimation for smooth, compactly supported densities.\u0000Formally, given (n) i.i.d. samples from an unknown (alpha)-H\"{o}lder\u0000density (f) supported on ([-1, 1]), we prove the minimax rate of estimating\u0000the score function of the diffused distribution (f * mathcal{N}(0, t)) with\u0000respect to the score matching loss is (frac{1}{nt^2} wedge\u0000frac{1}{nt^{3/2}} wedge (t^{alpha-1} + n^{-2(alpha-1)/(2alpha+1)})) for\u0000all (alpha > 0) and (t ge 0). As a consequence, it is shown the law\u0000(hat{f}) of a sample generated from the diffusion model achieves the sharp\u0000minimax rate (bE(dTV(hat{f}, f)^2) lesssim n^{-2alpha/(2alpha+1)}) for\u0000all (alpha > 0) without any extraneous logarithmic terms which are prevalent\u0000in the literature, and without the need for early stopping which has been\u0000required for all existing procedures to the best of our knowledge.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"9 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142206635","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Training-Free Guidance for Discrete Diffusion Models for Molecular Generation
arXiv - STAT - Machine Learning Pub Date : 2024-09-11 DOI: arxiv-2409.07359
Thomas J. Kerby, Kevin R. Moon
{"title":"Training-Free Guidance for Discrete Diffusion Models for Molecular Generation","authors":"Thomas J. Kerby, Kevin R. Moon","doi":"arxiv-2409.07359","DOIUrl":"https://doi.org/arxiv-2409.07359","url":null,"abstract":"Training-free guidance methods for continuous data have seen an explosion of\u0000interest due to the fact that they enable foundation diffusion models to be\u0000paired with interchangable guidance models. Currently, equivalent guidance\u0000methods for discrete diffusion models are unknown. We present a framework for\u0000applying training-free guidance to discrete data and demonstrate its utility on\u0000molecular graph generation tasks using the discrete diffusion model\u0000architecture of DiGress. We pair this model with guidance functions that return\u0000the proportion of heavy atoms that are a specific atom type and the molecular\u0000weight of the heavy atoms and demonstrate our method's ability to guide the\u0000data generation.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"62 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142206611","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Efficient and Unbiased Sampling of Boltzmann Distributions via Consistency Models
arXiv - STAT - Machine Learning Pub Date : 2024-09-11 DOI: arxiv-2409.07323
Fengzhe Zhang, Jiajun He, Laurence I. Midgley, Javier Antorán, José Miguel Hernández-Lobato
{"title":"Efficient and Unbiased Sampling of Boltzmann Distributions via Consistency Models","authors":"Fengzhe Zhang, Jiajun He, Laurence I. Midgley, Javier Antorán, José Miguel Hernández-Lobato","doi":"arxiv-2409.07323","DOIUrl":"https://doi.org/arxiv-2409.07323","url":null,"abstract":"Diffusion models have shown promising potential for advancing Boltzmann\u0000Generators. However, two critical challenges persist: (1) inherent errors in\u0000samples due to model imperfections, and (2) the requirement of hundreds of\u0000functional evaluations (NFEs) to achieve high-quality samples. While existing\u0000solutions like importance sampling and distillation address these issues\u0000separately, they are often incompatible, as most distillation models lack the\u0000necessary density information for importance sampling. This paper introduces a\u0000novel sampling method that effectively combines Consistency Models (CMs) with\u0000importance sampling. We evaluate our approach on both synthetic energy\u0000functions and equivariant n-body particle systems. Our method produces unbiased\u0000samples using only 6-25 NFEs while achieving a comparable Effective Sample Size\u0000(ESS) to Denoising Diffusion Probabilistic Models (DDPMs) that require\u0000approximately 100 NFEs.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142206641","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Manifold Learning via Foliations and Knowledge Transfer
arXiv - STAT - Machine Learning Pub Date : 2024-09-11 DOI: arxiv-2409.07412
E. Tron, E. Fioresi
{"title":"Manifold Learning via Foliations and Knowledge Transfer","authors":"E. Tron, E. Fioresi","doi":"arxiv-2409.07412","DOIUrl":"https://doi.org/arxiv-2409.07412","url":null,"abstract":"Understanding how real data is distributed in high dimensional spaces is the\u0000key to many tasks in machine learning. We want to provide a natural geometric\u0000structure on the space of data employing a deep ReLU neural network trained as\u0000a classifier. Through the data information matrix (DIM), a variation of the\u0000Fisher information matrix, the model will discern a singular foliation\u0000structure on the space of data. We show that the singular points of such\u0000foliation are contained in a measure zero set, and that a local regular\u0000foliation exists almost everywhere. Experiments show that the data is\u0000correlated with leaves of such foliation. Moreover we show the potential of our\u0000approach for knowledge transfer by analyzing the spectrum of the DIM to measure\u0000distances between datasets.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"4 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142206702","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
k-MLE, k-Bregman, k-VARs: Theory, Convergence, Computation
arXiv - STAT - Machine Learning Pub Date : 2024-09-11 DOI: arxiv-2409.06938
Zuogong Yue, Victor Solo
{"title":"k-MLE, k-Bregman, k-VARs: Theory, Convergence, Computation","authors":"Zuogong Yue, Victor Solo","doi":"arxiv-2409.06938","DOIUrl":"https://doi.org/arxiv-2409.06938","url":null,"abstract":"We develop hard clustering based on likelihood rather than distance and prove\u0000convergence. We also provide simulations and real data examples.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"13 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142206638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0