Bridging Algorithmic Information Theory and Machine Learning: Clustering, density estimation, Kolmogorov complexity-based kernels, and kernel learning in unsupervised learning
Authors: Boumediene Hamzi, Marcus Hutter, Houman Owhadi
Journal: Physica D: Nonlinear Phenomena, Volume 476, Article 134669 (JCR Q1, Mathematics, Applied; Impact Factor 2.7)
DOI: 10.1016/j.physd.2025.134669
Publication date: 2025-04-10 (Journal Article; not open access)
URL: https://www.sciencedirect.com/science/article/pii/S0167278925001484
Citations: 0
Abstract
Machine Learning (ML) and Algorithmic Information Theory (AIT) offer distinct yet complementary approaches to understanding and addressing complexity. This paper investigates the synergy between these disciplines in two directions: AIT for Kernel Methods and Kernel Methods for AIT. In the former, we explore how AIT concepts inspire the design of kernels that integrate principles like relative Kolmogorov complexity and normalized compression distance (NCD). We propose a novel clustering method utilizing the Minimum Description Length principle, implemented via K-means and Kernel Mean Embedding (KME). Additionally, we apply the Loss Rank Principle (LoRP) to learn optimal kernel parameters in the context of Kernel Density Estimation (KDE), thereby extending the applicability of AIT-inspired techniques to flexible, nonparametric models. In the latter, we show how kernel methods can be used to approximate measures such as NCD and Algorithmic Mutual Information (AMI), providing new tools for compression-based analysis. Furthermore, we demonstrate that the Hilbert–Schmidt Independence Criterion (HSIC) approximates AMI, offering a robust theoretical foundation for clustering and other dependence-measurement tasks. Building on our previous work introducing Sparse Kernel Flows from an AIT perspective, we extend these ideas to unsupervised learning, enhancing the theoretical robustness and interpretability of ML algorithms. Our results demonstrate that kernel methods are not only versatile tools for ML but also crucial for bridging AIT and ML, enabling more principled approaches to unsupervised learning tasks.
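Two of the quantities named in the abstract are easy to illustrate concretely. The normalized compression distance between strings x and y is NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the length of a compressed encoding; the biased empirical Hilbert-Schmidt Independence Criterion for paired samples is HSIC = (1/n²) tr(KHLH), with K and L Gram matrices and H the centering matrix. The sketch below is not the authors' implementation: it uses zlib as a crude stand-in for an ideal (Kolmogorov) compressor and a Gaussian kernel with an arbitrary bandwidth sigma = 1.0 for HSIC.

```python
import zlib
import numpy as np

def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance, using zlib compressed length
    as a computable proxy for Kolmogorov complexity."""
    cx = len(zlib.compress(x))
    cy = len(zlib.compress(y))
    cxy = len(zlib.compress(x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)

def hsic_biased(X: np.ndarray, Y: np.ndarray, sigma: float = 1.0) -> float:
    """Biased empirical HSIC with Gaussian kernels:
    (1/n^2) * tr(K H L H), where H = I - (1/n) * ones."""
    n = X.shape[0]

    def gram(Z: np.ndarray) -> np.ndarray:
        sq = np.sum(Z ** 2, axis=1)
        dists = sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T
        return np.exp(-dists / (2.0 * sigma ** 2))

    K, L = gram(X), gram(Y)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return float(np.trace(K @ H @ L @ H)) / n ** 2

# Illustration: NCD of a string with itself is small; HSIC is larger
# for statistically dependent pairs than for independent ones.
a = b"the quick brown fox jumps over the lazy dog " * 20
b = b"lorem ipsum dolor sit amet, consectetur adipiscing " * 20
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 1))
y_dep = x ** 2                      # deterministically dependent on x
y_ind = rng.normal(size=(200, 1))   # independent of x
```

In this toy setting `ncd(a, a)` is much smaller than `ncd(a, b)`, and `hsic_biased(x, y_dep)` exceeds `hsic_biased(x, y_ind)`, mirroring the abstract's point that HSIC behaves as a practical dependence measure in the spirit of algorithmic mutual information.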
About the journal
Physica D (Nonlinear Phenomena) publishes research and review articles reporting on experimental and theoretical works, techniques and ideas that advance the understanding of nonlinear phenomena. Topics encompass wave motion in physical, chemical and biological systems; physical or biological phenomena governed by nonlinear field equations, including hydrodynamics and turbulence; pattern formation and cooperative phenomena; instability, bifurcations, chaos, and space-time disorder; integrable/Hamiltonian systems; asymptotic analysis and, more generally, mathematical methods for nonlinear systems.