{"title":"PieClam: A Universal Graph Autoencoder Based on Overlapping Inclusive and Exclusive Communities","authors":"Daniel Zilberg, Ron Levie","doi":"arxiv-2409.11618","DOIUrl":"https://doi.org/arxiv-2409.11618","url":null,"abstract":"We propose PieClam (Prior Inclusive Exclusive Cluster Affiliation Model): a probabilistic graph model for representing any graph as overlapping generalized communities. Our method can be interpreted as a graph autoencoder: nodes are embedded into a code space by an algorithm that maximizes the log-likelihood of the decoded graph, given the input graph. PieClam is a community affiliation model that extends well-known methods like BigClam in two main ways. First, instead of defining the decoder solely via pairwise interactions between the nodes in the code space, we also incorporate a learned prior on the distribution of nodes in the code space, turning our method into a graph generative model. Second, we generalize the notion of communities by allowing not only sets of nodes with strong connectivity, which we call inclusive communities, but also sets of nodes with strong disconnection, which we call exclusive communities. To model both types of communities, we propose a new type of decoder based on the Lorentz inner product, which we prove to be much more expressive than standard decoders based on standard inner products or norm distances. By introducing a new graph similarity measure, which we call the log cut distance, we show that PieClam is a universal autoencoder, able to uniformly approximately reconstruct any graph. Our method is shown to obtain competitive performance in graph anomaly detection benchmarks.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"27 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261676","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Recurrent Interpolants for Probabilistic Time Series Prediction","authors":"Yu Chen, Marin Biloš, Sarthak Mittal, Wei Deng, Kashif Rasul, Anderson Schneider","doi":"arxiv-2409.11684","DOIUrl":"https://doi.org/arxiv-2409.11684","url":null,"abstract":"Sequential models such as recurrent neural networks and transformer-based models have become de facto tools for probabilistic multivariate time series forecasting, with applications to a wide range of datasets in finance, biology, medicine, and beyond. Despite their adeptness at capturing dependencies, assessing prediction uncertainty, and training efficiently, challenges emerge in modeling high-dimensional complex distributions and cross-feature dependencies. To tackle these issues, recent works turn to generative modeling by employing diffusion- or flow-based models. Notably, the integration of stochastic differential equations or probability flow successfully extends these methods to probabilistic time series imputation and forecasting. However, scalability issues necessitate a computation-friendly framework for large-scale generative model-based predictions. This work proposes a novel approach that blends the computational efficiency of recurrent neural networks with the high-quality probabilistic modeling of diffusion models, addressing these challenges and advancing the application of generative models to time series forecasting. Our method builds on the foundation of stochastic interpolants and extends them to a broader conditional generation framework with additional control features, offering insights for future developments in this dynamic field.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"11 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261651","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fitting Multilevel Factor Models","authors":"Tetiana Parshakova, Trevor Hastie, Stephen Boyd","doi":"arxiv-2409.12067","DOIUrl":"https://doi.org/arxiv-2409.12067","url":null,"abstract":"We examine a special case of the multilevel factor model, with covariance given by a multilevel low rank (MLR) matrix (Parshakova et al., 2023). We develop a novel, fast implementation of the expectation-maximization (EM) algorithm, tailored for multilevel factor models, to maximize the likelihood of the observed data. This method accommodates any hierarchical structure and maintains linear time and storage complexities per iteration. This is achieved through a new efficient technique for computing the inverse of the positive definite MLR matrix. We show that the inverse of an invertible PSD MLR matrix is also an MLR matrix with the same sparsity in factors, and we use the recursive Sherman-Morrison-Woodbury matrix identity to obtain the factors of the inverse. Additionally, we present an algorithm that computes the Cholesky factorization of an expanded matrix with linear time and space complexities, yielding the covariance matrix as its Schur complement. This paper is accompanied by an open-source package that implements the proposed methods.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"18 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cartan moving frames and the data manifolds","authors":"Eliot Tron, Rita Fioresi, Nicolas Couellan, Stéphane Puechmorel","doi":"arxiv-2409.12057","DOIUrl":"https://doi.org/arxiv-2409.12057","url":null,"abstract":"The purpose of this paper is to employ the language of Cartan moving frames to study the geometry of data manifolds and their Riemannian structure, via the data information metric and its curvature at data points. Using this framework and through experiments, explanations of a neural network's response are given by pointing out the output classes that are easily reachable from a given input. This emphasizes how the proposed mathematical relationship between the output of the network and the geometry of its inputs can be exploited as an explainable artificial intelligence tool.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"30 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261649","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Symmetry-Based Structured Matrices for Efficient Approximately Equivariant Networks","authors":"Ashwin Samudre, Mircea Petrache, Brian D. Nord, Shubhendu Trivedi","doi":"arxiv-2409.11772","DOIUrl":"https://doi.org/arxiv-2409.11772","url":null,"abstract":"There has been much recent interest in designing symmetry-aware neural networks (NNs) exhibiting relaxed equivariance. Such NNs aim to interpolate between being exactly equivariant and being fully flexible, affording consistent performance benefits. In a separate line of work, certain structured parameter matrices -- those with displacement structure, characterized by low displacement rank (LDR) -- have been used to design small-footprint NNs. Displacement structure enables fast function and gradient evaluation, but permits accurate approximations via compression primarily for classical convolutional neural networks (CNNs). In this work, we propose a general framework -- based on a novel construction of symmetry-based structured matrices -- to build approximately equivariant NNs with significantly reduced parameter counts. Our framework integrates the two aforementioned lines of work via so-called Group Matrices (GMs), a forgotten precursor to the modern notion of regular representations of finite groups. GMs allow the design of structured matrices -- resembling LDR matrices -- that generalize the linear operations of a classical CNN from cyclic groups to general finite groups and their homogeneous spaces. We show that GMs can be employed to extend all the elementary operations of CNNs to general discrete groups. Further, the theory of structured matrices based on GMs generalizes LDR theory, which has focused on matrices with cyclic structure, providing a tool for implementing approximate equivariance for discrete groups. We test GM-based architectures on a variety of tasks in the presence of relaxed symmetry. Our framework consistently performs competitively with approximately equivariant NNs and other structured matrix-based compression frameworks, sometimes with one to two orders of magnitude lower parameter counts.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261650","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning Unstable Continuous-Time Stochastic Linear Control Systems","authors":"Reza Sadeghi Hafshejani, Mohamad Kazem Shirani Fradonbeh","doi":"arxiv-2409.11327","DOIUrl":"https://doi.org/arxiv-2409.11327","url":null,"abstract":"We study the problem of system identification for stochastic continuous-time dynamics, based on a single finite-length state trajectory. We present a method for estimating the possibly unstable open-loop matrix by employing properly randomized control inputs. Then, we establish theoretical performance guarantees showing that the estimation error decays with trajectory length, a measure of excitability, and the signal-to-noise ratio, while it grows with dimension. Numerical illustrations showcasing the rates of learning the dynamics are provided as well. To perform the theoretical analysis, we develop new technical tools that are of independent interest. These include non-asymptotic stochastic bounds for highly non-stationary martingales and generalized laws of the iterated logarithm, among others.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"119 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261681","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Latent mixed-effect models for high-dimensional longitudinal data","authors":"Priscilla Ong, Manuel Haußmann, Otto Lönnroth, Harri Lähdesmäki","doi":"arxiv-2409.11008","DOIUrl":"https://doi.org/arxiv-2409.11008","url":null,"abstract":"Modelling longitudinal data is an important yet challenging task. These datasets can be high-dimensional and contain non-linear effects and time-varying covariates. Gaussian process (GP) prior-based variational autoencoders (VAEs) have emerged as a promising approach due to their ability to model time-series data. However, they are costly to train and struggle to fully exploit the rich covariates characteristic of longitudinal data, making them difficult for practitioners to use effectively. In this work, we leverage linear mixed models (LMMs) and amortized variational inference to provide conditional priors for VAEs, and propose LMM-VAE, a scalable, interpretable and identifiable model. We highlight theoretical connections between it and GP-based techniques, providing a unified framework for this class of methods. Our proposal performs competitively compared to existing approaches across simulated and real-world datasets.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"212 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261685","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the generalization ability of coarse-grained molecular dynamics models for non-equilibrium processes","authors":"Liyao Lyu, Huan Lei","doi":"arxiv-2409.11519","DOIUrl":"https://doi.org/arxiv-2409.11519","url":null,"abstract":"One essential goal of constructing coarse-grained molecular dynamics (CGMD) models is to accurately predict non-equilibrium processes beyond the atomistic scale. While a CG model can be constructed by projecting the full dynamics onto a set of resolved variables, the dynamics of the CG variables can recover the full dynamics only when the conditional distribution of the unresolved variables is close to the one associated with the particular projection operator. In particular, the model's applicability to various non-equilibrium processes is generally unwarranted due to inconsistency in the conditional distribution. Here, we present a data-driven approach for constructing CGMD models that retain a certain generalization ability for non-equilibrium processes. Unlike conventional CG models based on pre-selected CG variables (e.g., the center of mass), the present CG model seeks a set of auxiliary CG variables, based on time-lagged independent component analysis, to minimize the entropy contribution of the unresolved variables. This ensures that the distribution of the unresolved variables under a broad range of non-equilibrium conditions approaches the one under equilibrium. Numerical results for a polymer melt system demonstrate the significance of this broadly overlooked metric for the model's generalization ability, and the effectiveness of the present CG model in predicting complex viscoelastic responses under various non-equilibrium flows.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"89 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261679","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Outlier Detection with Cluster Catch Digraphs","authors":"Rui Shi, Nedret Billor, Elvan Ceyhan","doi":"arxiv-2409.11596","DOIUrl":"https://doi.org/arxiv-2409.11596","url":null,"abstract":"This paper introduces a novel family of outlier detection algorithms based on Cluster Catch Digraphs (CCDs), specifically tailored to address the challenges of high dimensionality and varying cluster shapes, which deteriorate the performance of most traditional outlier detection methods. We propose the Uniformity-Based CCD with Mutual Catch Graph (U-MCCD), the Uniformity- and Neighbor-Based CCD with Mutual Catch Graph (UN-MCCD), and their shape-adaptive variants (SU-MCCD and SUN-MCCD), which are designed to detect outliers in data sets with arbitrary cluster shapes and high dimensions. We present the advantages and shortcomings of these algorithms and explain the motivation for each particular algorithm. Through comprehensive Monte Carlo simulations, we assess their performance and demonstrate the robustness and effectiveness of our algorithms across various settings and contamination levels. We also illustrate the use of our algorithms on various real-life data sets. The U-MCCD algorithm efficiently identifies outliers while maintaining high true negative rates, and the SU-MCCD algorithm shows substantial improvement in handling non-uniform clusters. Additionally, the UN-MCCD and SUN-MCCD algorithms address the limitations of existing methods in high-dimensional spaces by utilizing Nearest Neighbor Distances (NND) for clustering and outlier detection. Our results indicate that these novel algorithms offer substantial advancements in the accuracy and adaptability of outlier detection, providing a valuable tool for various real-world applications. Keywords: Outlier detection, Graph-based clustering, Cluster catch digraphs, $k$-nearest-neighborhood, Mutual catch graphs, Nearest neighbor distance.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"24 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261675","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards Gaussian Process for operator learning: an uncertainty aware resolution independent operator learning algorithm for computational mechanics","authors":"Sawan Kumar, Rajdip Nayek, Souvik Chakraborty","doi":"arxiv-2409.10972","DOIUrl":"https://doi.org/arxiv-2409.10972","url":null,"abstract":"The growing demand for accurate, efficient, and scalable solutions in computational mechanics highlights the need for advanced operator learning algorithms that can efficiently handle large datasets while providing reliable uncertainty quantification. This paper introduces a novel Gaussian process (GP) based neural operator for solving parametric differential equations. The proposed approach leverages the expressive capability of deterministic neural operators and the uncertainty awareness of conventional GPs. In particular, we propose a 'neural operator-embedded kernel' wherein the GP kernel is formulated in the latent space learned by a neural operator. Further, we exploit a stochastic dual descent (SDD) algorithm to simultaneously train the neural operator parameters and the GP hyperparameters. Our approach addresses the (a) resolution dependence and (b) cubic complexity of traditional GP models, allowing for input-resolution independence and scalability in high-dimensional and non-linear parametric systems, such as those encountered in computational mechanics. We apply our method to a range of non-linear parametric partial differential equations (PDEs) and demonstrate its superiority in both computational efficiency and accuracy compared to standard GP models and wavelet neural operators. Our experimental results highlight the efficacy of this framework in solving complex PDEs while maintaining robustness in uncertainty estimation, positioning it as a scalable and reliable operator-learning algorithm for computational mechanics.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"18 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142261680","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}