Bridging Algorithmic Information Theory and Machine Learning: Clustering, density estimation, Kolmogorov complexity-based kernels, and kernel learning in unsupervised learning
Authors: Boumediene Hamzi, Marcus Hutter, Houman Owhadi
Journal: Physica D: Nonlinear Phenomena, Volume 476, Article 134669 (JCR Q1, Mathematics, Applied; Impact Factor 2.7)
DOI: 10.1016/j.physd.2025.134669
Publication date: 2025-04-10 (Journal Article; not open access)
URL: https://www.sciencedirect.com/science/article/pii/S0167278925001484
Citations: 0
Abstract
Machine Learning (ML) and Algorithmic Information Theory (AIT) offer distinct yet complementary approaches to understanding and addressing complexity. This paper investigates the synergy between these disciplines in two directions: AIT for Kernel Methods and Kernel Methods for AIT. In the former, we explore how AIT concepts inspire the design of kernels that integrate principles like relative Kolmogorov complexity and normalized compression distance (NCD). We propose a novel clustering method utilizing the Minimum Description Length principle, implemented via K-means and Kernel Mean Embedding (KME). Additionally, we apply the Loss Rank Principle (LoRP) to learn optimal kernel parameters in the context of Kernel Density Estimation (KDE), thereby extending the applicability of AIT-inspired techniques to flexible, nonparametric models. In the latter, we show how kernel methods can be used to approximate measures such as NCD and Algorithmic Mutual Information (AMI), providing new tools for compression-based analysis. Furthermore, we demonstrate that the Hilbert–Schmidt Independence Criterion (HSIC) approximates AMI, offering a robust theoretical foundation for clustering and other dependence-measurement tasks. Building on our previous work introducing Sparse Kernel Flows from an AIT perspective, we extend these ideas to unsupervised learning, enhancing the theoretical robustness and interpretability of ML algorithms. Our results demonstrate that kernel methods are not only versatile tools for ML but also crucial for bridging AIT and ML, enabling more principled approaches to unsupervised learning tasks.
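Two of the quantities named in the abstract are easy to illustrate concretely. The normalized compression distance between strings x and y is NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the length of a compressed encoding; the biased empirical Hilbert-Schmidt Independence Criterion for paired samples is HSIC = (1/n²) tr(KHLH), with K and L Gram matrices and H the centering matrix. The sketch below is not the authors' implementation: it uses zlib as a crude stand-in for an ideal (Kolmogorov) compressor and a Gaussian kernel with an arbitrary bandwidth sigma = 1.0 for HSIC.

```python
import zlib
import numpy as np

def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance, using zlib compressed length
    as a computable proxy for Kolmogorov complexity."""
    cx = len(zlib.compress(x))
    cy = len(zlib.compress(y))
    cxy = len(zlib.compress(x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)

def hsic_biased(X: np.ndarray, Y: np.ndarray, sigma: float = 1.0) -> float:
    """Biased empirical HSIC with Gaussian kernels:
    (1/n^2) * tr(K H L H), where H = I - (1/n) * ones."""
    n = X.shape[0]

    def gram(Z: np.ndarray) -> np.ndarray:
        sq = np.sum(Z ** 2, axis=1)
        dists = sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T
        return np.exp(-dists / (2.0 * sigma ** 2))

    K, L = gram(X), gram(Y)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return float(np.trace(K @ H @ L @ H)) / n ** 2

# Illustration: NCD of a string with itself is small; HSIC is larger
# for statistically dependent pairs than for independent ones.
a = b"the quick brown fox jumps over the lazy dog " * 20
b = b"lorem ipsum dolor sit amet, consectetur adipiscing " * 20
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 1))
y_dep = x ** 2                      # deterministically dependent on x
y_ind = rng.normal(size=(200, 1))   # independent of x
```

In this toy setting `ncd(a, a)` is much smaller than `ncd(a, b)`, and `hsic_biased(x, y_dep)` exceeds `hsic_biased(x, y_ind)`, mirroring the abstract's point that HSIC behaves as a practical dependence measure in the spirit of algorithmic mutual information.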
About the journal
Physica D (Nonlinear Phenomena) publishes research and review articles reporting on experimental and theoretical works, techniques and ideas that advance the understanding of nonlinear phenomena. Topics encompass wave motion in physical, chemical and biological systems; physical or biological phenomena governed by nonlinear field equations, including hydrodynamics and turbulence; pattern formation and cooperative phenomena; instability, bifurcations, chaos, and space-time disorder; integrable/Hamiltonian systems; asymptotic analysis and, more generally, mathematical methods for nonlinear systems.