Properties of Algorithmic Information Distance
Marcus Hutter
IEEE Transactions on Information Theory, vol. 71, no. 10, pp. 7540-7554, published 2025-08-08
DOI: 10.1109/TIT.2025.3597092
https://ieeexplore.ieee.org/document/11121567/
The domain-independent universal Normalized Information Distance based on Kolmogorov complexity has been successfully applied, in approximate form, to a variety of difficult clustering problems. In this paper we investigate theoretical properties of the un-normalized algorithmic information distance $d_{K}$. The main question we ask is what properties this curious distance has besides being a metric. We show that many (in)finite-dimensional spaces can(not) be isometrically scale-embedded into the space of finite strings with metric $d_{K}$. We also show that $d_{K}$ is not a Euclidean distance, but that any finite set of points in Euclidean space can be scale-embedded into $(\{0,1\}^{*},d_{K})$. A major contribution is the development of the necessary framework and tools for finding more (interesting) properties of $d_{K}$ in the future, together with the statement of several open problems.
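The "approximate form" mentioned in the abstract is usually obtained by replacing the uncomputable Kolmogorov complexity $K$ with the output length of a real compressor. The sketch below illustrates that idea only; it is not taken from the paper. It assumes the standard information distance $\max\{K(x|y), K(y|x)\}$ as the un-normalized quantity (the abstract does not spell out the exact definition of $d_{K}$), uses `zlib` as a stand-in compressor, and the function names `compressed_len`, `approx_info_distance`, and `ncd` are hypothetical.

```python
import zlib


def compressed_len(data: bytes) -> int:
    """Length of the zlib-compressed data in bytes: a crude, computable proxy for K."""
    return len(zlib.compress(data, 9))


def approx_info_distance(x: bytes, y: bytes) -> int:
    """Compression-based proxy for max{K(x|y), K(y|x)}.

    Assumption: K(x|y) is approximated by C(yx) - C(y), and C(xy) is taken
    to be close to C(yx), so the maximum becomes C(xy) - min(C(x), C(y)).
    """
    return compressed_len(x + y) - min(compressed_len(x), compressed_len(y))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: the proxy above divided by max(C(x), C(y))."""
    cx, cy = compressed_len(x), compressed_len(y)
    return (compressed_len(x + y) - min(cx, cy)) / max(cx, cy)
```

With a real compressor these values only roughly follow the theoretical quantities: they need not be symmetric, the distance from a string to itself is small rather than zero, and the metric properties studied in the paper hold only approximately, if at all.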
About the journal:
The IEEE Transactions on Information Theory is a journal that publishes theoretical and experimental papers concerned with the transmission, processing, and utilization of information. The boundaries of acceptable subject matter are intentionally not sharply delimited. Rather, it is hoped that as the focus of research activity changes, a flexible policy will permit this Transactions to follow suit. Current appropriate topics are best reflected by recent Tables of Contents; they are summarized in the titles of editorial areas that appear on the inside front cover.