Clustering, coding, and the concept of similarity

IF 1.0 | Zone 4 (Computer Science) | JCR Q4 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)

Annals of Mathematics and Artificial Intelligence, 92(5), 1197-1248 | Pub Date: 2024-03-19 | DOI: 10.1007/s10472-024-09929-7

L. Thorne McCarty


Citations: 0

Abstract


Clustering, coding, and the concept of similarity

This paper develops a theory of clustering and coding that combines a geometric model with a probabilistic model in a principled way. The geometric model is a Riemannian manifold with a Riemannian metric, \({g}_{ij}(\textbf{x})\), which we interpret as a measure of dissimilarity. The probabilistic model consists of a stochastic process with an invariant probability measure that matches the density of the sample input data. The link between the two models is a potential function, \(U(\textbf{x})\), and its gradient, \(\nabla U(\textbf{x})\). We use the gradient to define the dissimilarity metric, which guarantees that our measure of dissimilarity will depend on the probability measure. Finally, we use the dissimilarity metric to define a coordinate system on the embedded Riemannian manifold, which gives us a low-dimensional encoding of our original data.
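The abstract does not say which stochastic process is used, so the following is only a minimal sketch of the general idea that a potential function can determine an invariant probability measure. It assumes a one-dimensional diffusion with drift \(\nabla U\) and unit noise, \(dX_t = \nabla U(X_t)\,dt + dW_t\), whose stationary density is proportional to \(e^{2U(x)}\) (a standard Fokker-Planck result), and a hypothetical potential \(U(x) = -x^2/2\). The names `grad_U`, `simulate`, and `variance` are illustrative and are not taken from the paper.

```python
import math
import random


def grad_U(x):
    # Gradient of the hypothetical potential U(x) = -x^2 / 2 (an assumption).
    return -x


def simulate(n_steps=400_000, dt=0.01, seed=0):
    """Euler-Maruyama discretization of dX = grad U(X) dt + dW."""
    rng = random.Random(seed)
    sqrt_dt = math.sqrt(dt)
    x = 0.0
    samples = []
    for _ in range(n_steps):
        # Drift step along the gradient plus a Gaussian noise increment.
        x += grad_U(x) * dt + sqrt_dt * rng.gauss(0.0, 1.0)
        samples.append(x)
    return samples


def variance(samples):
    # Population variance of the simulated trajectory.
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)
```

Under these assumptions, \(e^{2U(x)} = e^{-x^2}\) is an unnormalized Gaussian with variance 1/2, so `variance(simulate())` should land close to 0.5, up to discretization and sampling error. Matching the invariant measure to the density of real sample data, as the abstract describes, would require estimating \(U\) from that data, which this sketch does not attempt.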


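Source journal

Annals of Mathematics and Artificial Intelligence (Engineering & Technology: Computer Science, Artificial Intelligence)

CiteScore: 3.00

Self-citation rate: 8.30%

Articles published:

Review time: >12 weeks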

Journal description: Annals of Mathematics and Artificial Intelligence presents a range of topics of concern to scholars applying quantitative, combinatorial, logical, algebraic and algorithmic methods to diverse areas of Artificial Intelligence, from decision support, automated deduction, and reasoning, to knowledge-based systems, machine learning, computer vision, robotics and planning. The journal features collections of papers appearing either in volumes (400 pages) or in separate issues (100-300 pages), which focus on one topic and have one or more guest editors. Annals of Mathematics and Artificial Intelligence hopes to influence the spawning of new areas of applied mathematics and strengthen the scientific underpinnings of Artificial Intelligence.