Journal of Machine Learning Research: latest articles, page 4

Consistent Second-Order Conic Integer Programming for Learning Bayesian Networks.

IF 4.3, Zone 3 (Computer Science)

Journal of Machine Learning Research Pub Date : 2023-01-01

Simge Küçükyavuz, Ali Shojaie, Hasan Manzour, Linchuan Wei, Hao-Hsiang Wu

Abstract: Bayesian Networks (BNs) represent conditional probability relations among a set of random variables (nodes) in the form of a directed acyclic graph (DAG), and have found diverse applications in knowledge discovery. We study the problem of learning the sparse DAG structure of a BN from continuous observational data. The central problem can be modeled as a mixed-integer program with an objective function composed of a convex quadratic loss function and a regularization penalty subject to linear constraints. The optimal solution to this mathematical program is known to have desirable statistical properties under certain conditions. However, the state-of-the-art optimization solvers are not able to obtain provably optimal solutions to the existing mathematical formulations for medium-size problems within reasonable computational times. To address this difficulty, we tackle the problem from both computational and statistical perspectives. On the one hand, we propose a concrete early stopping criterion to terminate the branch-and-bound process in order to obtain a near-optimal solution to the mixed-integer program, and establish the consistency of this approximate solution. On the other hand, we improve the existing formulations by replacing the linear "big-M" constraints that represent the relationship between the continuous and binary indicator variables with second-order conic constraints. Our numerical results demonstrate the effectiveness of the proposed approaches.

Volume 24. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11257021/pdf/
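The contrast between "big-M" and second-order conic constraints mentioned in this abstract can be sketched as follows. The abstract does not write the constraints out, so the notation here is assumed for illustration: $\beta_{jk}$ is a continuous arc weight, $g_{jk}$ is the binary indicator that arc $(j,k)$ is present, $s_{jk}$ is an auxiliary variable, and $M$ is a user-chosen bound. This shows the generic form of such links, not a quotation of the paper's formulation.

```latex
% Big-M link: the weight may be nonzero only when the arc indicator is on.
-M\, g_{jk} \;\le\; \beta_{jk} \;\le\; M\, g_{jk}, \qquad g_{jk} \in \{0,1\}

% Conic link: an auxiliary variable s_{jk} bounds the squared weight.
% This is a rotated second-order cone constraint in (\beta_{jk}, s_{jk}, g_{jk}).
\beta_{jk}^2 \;\le\; s_{jk}\, g_{jk}, \qquad s_{jk} \ge 0, \quad g_{jk} \in \{0,1\}
```

In both forms, $g_{jk} = 0$ forces $\beta_{jk} = 0$. The conic form does not depend on the choice of $M$, which is one common motivation for preferring it when the continuous relaxation of the big-M form is weak.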

Citations: 0

Generalized Matrix Factorization: efficient algorithms for fitting generalized linear latent variable models to large data arrays.

IF 6, Zone 3 (Computer Science)

Journal of Machine Learning Research Pub Date : 2022-11-01

Łukasz Kidziński, Francis K C Hui, David I Warton, Trevor Hastie

Abstract: Unmeasured or latent variables are often the cause of correlations between multivariate measurements, which are studied in a variety of fields such as psychology, ecology, and medicine. For Gaussian measurements, there are classical tools such as factor analysis or principal component analysis with a well-established theory and fast algorithms. Generalized Linear Latent Variable models (GLLVMs) generalize such factor models to non-Gaussian responses. However, current algorithms for estimating model parameters in GLLVMs require intensive computation and do not scale to large datasets with thousands of observational units or responses. In this article, we propose a new approach for fitting GLLVMs to high-dimensional datasets, based on approximating the model using penalized quasi-likelihood and then using a Newton method and Fisher scoring to learn the model parameters. Computationally, our method is noticeably faster and more stable, enabling GLLVM fits to much larger matrices than previously possible. We apply our method to a dataset of 48,000 observational units with over 2,000 observed species in each unit and find that most of the variability can be explained with a handful of factors. We publish an easy-to-use implementation of our proposed fitting algorithm.

Volume 23. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10129058/pdf/nihms-1843577.pdf
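As a rough illustration of the Fisher scoring step named in this abstract, the sketch below fits a single-coefficient Poisson model with a log link. It is not the authors' algorithm or their published implementation: the function name `fisher_scoring_poisson`, the toy data, and the one-parameter model are all assumptions chosen to keep the update rule visible.

```python
import math

def fisher_scoring_poisson(x, y, b0=0.0, tol=1e-10, max_iter=100):
    """Fit y_i ~ Poisson(exp(b * x_i)) for one coefficient b (hypothetical toy model)."""
    b = b0
    for _ in range(max_iter):
        mu = [math.exp(b * xi) for xi in x]                           # fitted means
        score = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))   # d loglik / d b
        info = sum(xi * xi * mi for xi, mi in zip(x, mu))             # Fisher information
        step = score / info
        b += step                                                     # scoring update
        if abs(step) < tol:
            break
    return b

# Toy data, invented for illustration only.
x = [0.0, 0.5, 1.0, 1.5, 2.0]
y = [1, 2, 3, 4, 8]

b_hat = fisher_scoring_poisson(x, y)
# At the maximum-likelihood estimate the score should vanish.
score_at_fit = sum(xi * (yi - math.exp(b_hat * xi)) for xi, yi in zip(x, y))
print(b_hat, score_at_fit)
```

For the log link used here, the Fisher information equals the negative second derivative of the log-likelihood, so Fisher scoring and Newton's method give the same update in this toy case.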

Citations: 0

Tree-based Node Aggregation in Sparse Graphical Models.

IF 4.3, Zone 3 (Computer Science)

Journal of Machine Learning Research Pub Date : 2022-09-01

Ines Wilms, Jacob Bien

Citations: 0

Reinforcement Learning Algorithm for Mixed Mean Field Control Games

IF 6, Zone 3 (Computer Science)

Journal of Machine Learning Research Pub Date : 2022-05-04 DOI: 10.4208/jml.220915

Andrea Angiuli, Nils Detering, J. Fouque, M. Laurière, Jimin Lin

Citations: 6

Beyond the Quadratic Approximation: The Multiscale Structure of Neural Network Loss Landscapes

IF 6, Zone 3 (Computer Science)

Journal of Machine Learning Research Pub Date : 2022-04-24 DOI: 10.4208/jml.220404

Chao Ma, D. Kunin, Lei Wu, Lexing Ying

Abstract: A quadratic approximation of neural network loss landscapes has been extensively used to study the optimization process of these networks. Although it usually holds in a very small neighborhood of the minimum, it cannot explain many phenomena observed during the optimization process. In this work, we study the structure of neural network loss functions and its implications for optimization in a region beyond the reach of a good quadratic approximation. Numerically, we observe that neural network loss functions possess a multiscale structure, manifested in two ways: (1) in a neighborhood of minima, the loss mixes a continuum of scales and grows subquadratically, and (2) in a larger region, the loss clearly shows several separate scales. Using the subquadratic growth, we are able to explain the Edge of Stability phenomenon [5] observed for the gradient descent (GD) method. Using the separate scales, we explain the working mechanism of learning rate decay by simple examples. Finally, we study the origin of the multiscale structure and propose that the non-convexity of the models and the non-uniformity of training data are among the causes. By constructing a two-layer neural network problem we show that training data with different magnitudes give rise to different scales of the loss function, producing subquadratic growth and multiple separate scales.
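A one-dimensional toy can hint at why subquadratic growth matters for gradient descent with a fixed step size. The sketch below is not the paper's construction: the losses $\tfrac{a}{2}x^2$ and $|x|^p/p$, the values `a = 5.0`, `p = 1.5`, `lr = 0.5`, and the helper `gradient_descent` are all assumptions made for illustration.

```python
import math

def gradient_descent(grad, x0, lr, steps):
    """Plain fixed-step gradient descent on a scalar variable."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

lr = 0.5
a = 5.0   # curvature of the quadratic loss (a/2) * x**2; here lr > 2/a
p = 1.5   # growth exponent of the subquadratic loss |x|**p / p, with 1 < p < 2

def quad_grad(x):
    return a * x

def sub_grad(x):
    # derivative of |x|**p / p is sign(x) * |x|**(p - 1)
    return math.copysign(abs(x) ** (p - 1), x)

# Quadratic loss: each step multiplies x by (1 - lr * a) = -1.5, so |x| blows up.
x_quad = gradient_descent(quad_grad, 1.0, lr, 50)

# Subquadratic loss: the iterates neither diverge nor converge to 0;
# they settle into a sign-flipping oscillation of fixed amplitude.
x_sub = gradient_descent(sub_grad, 1.0, lr, 200)

# Amplitude of that oscillation: x_{t+1} = -x_t requires lr * |x|**(p-1) = 2 * |x|.
cycle_amplitude = (lr / 2) ** (1 / (2 - p))

print(abs(x_quad), abs(x_sub), cycle_amplitude)
```

With the quadratic loss the same step size diverges, whereas with the subquadratic loss the curvature falls as $|x|$ grows, so large iterates are pulled back and the trajectory stays bounded. This is only a caricature of the mechanism the abstract describes.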

Citations: 13

Tree-Values: Selective Inference for Regression Trees.

IF 4.3, Zone 3 (Computer Science)