Sub-One Quasi-Norm-Based k-Means Clustering Algorithm and Analyses

IF 2.8 | Zone 4, Computer Science | Q3, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Neural Processing Letters Pub Date : 2024-05-13 DOI:10.1007/s11063-024-11615-y

Qi An, Shan Jiang


Citations: 0

Abstract


Sub-One Quasi-Norm-Based k-Means Clustering Algorithm and Analyses


Recognizing the pivotal role of choosing an appropriate distance metric in designing the clustering algorithm, our focus is on innovating the k-means method by redefining the distance metric in its distortion. In this study, we introduce a novel k-means clustering algorithm utilizing a distance metric derived from the \(\ell _p\) quasi-norm with \(p\in (0,1)\). Through an illustrative example, we showcase the advantageous properties of the proposed distance metric compared to commonly used alternatives for revealing natural groupings in data. Subsequently, we present a novel k-means type heuristic by integrating this sub-one quasi-norm-based distance, offer a step-by-step iterative relocation scheme, and prove the convergence to the Kuhn-Tucker point. Finally, we empirically validate the effectiveness of our clustering method through experiments on synthetic and real-life datasets, both in their original form and with additional noise introduced. We also investigate the performance of the proposed method as a subroutine in a deep learning clustering algorithm. Our results demonstrate the efficacy of the proposed k-means algorithm in capturing distinctive patterns exhibited by certain data types.
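The abstract does not spell out the distortion or the relocation scheme, so the following is only a minimal sketch under stated assumptions. It takes the distance to be d(x, y) = sum_i |x_i - y_i|^p with p in (0, 1), that is, the p-th power of the \(\ell _p\) quasi-norm of x - y; the paper's exact definition may differ. The function names (lp_distance, assign, relocate, lp_kmeans) are hypothetical. The relocation step here is a swapped-in technique, not the authors' iterative scheme: because each coordinate's cost is concave between consecutive data values when 0 < p < 1, a minimizer can be found by trying the members' own coordinate values.

```python
def lp_distance(x, y, p=0.5):
    """Assumed distance: sum_i |x_i - y_i|**p for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie in (0, 1)")
    return sum(abs(a - b) ** p for a, b in zip(x, y))


def assign(points, centers, p=0.5):
    """Assignment step: index of the nearest center for every point."""
    return [
        min(range(len(centers)), key=lambda j: lp_distance(x, centers[j], p))
        for x in points
    ]


def relocate(members, p=0.5):
    """Illustrative relocation step (not the paper's scheme).

    The cost separates by coordinate, and each coordinate's cost is
    minimized at one of the members' values, so search over those values.
    """
    center = []
    for d in range(len(members[0])):
        values = [m[d] for m in members]
        best = min(values, key=lambda c: sum(abs(v - c) ** p for v in values))
        center.append(best)
    return tuple(center)


def lp_kmeans(points, centers, p=0.5, max_iter=100):
    """Alternate assignment and relocation until the labels stop changing."""
    centers = [tuple(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        new_labels = assign(points, centers, p)
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(len(centers)):
            members = [x for x, lab in zip(points, labels) if lab == j]
            if members:  # keep the previous center if a cluster is empty
                centers[j] = relocate(members, p)
    return labels, centers


if __name__ == "__main__":
    data = [(0, 0), (0.1, 0), (0, 0.2), (5, 5), (5.1, 5), (5, 4.9)]
    labels, centers = lp_kmeans(data, centers=[(0, 0), (5, 5)], p=0.5)
    print(labels, centers)
```

With p < 1, a large deviation in a single coordinate is penalized far less than under the squared Euclidean distance, which is one plausible reading of why such a distance could behave differently on noisy data; the abstract itself only reports that the method is validated on data with added noise.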


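Source journal

Neural Processing Letters (Engineering & Technology; Computer Science: Artificial Intelligence)

CiteScore: 4.90

Self-citation rate: 12.90%

Number of publications: 392

Review time: 2.8 months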

Journal description: Neural Processing Letters is an international journal publishing research results and innovative ideas on all aspects of artificial neural networks. Coverage includes theoretical developments, biological models, new formal modes, learning, applications, software and hardware developments, and prospective researches. The journal promotes fast exchange of information in the community of neural network researchers and users. The resurgence of interest in the field of artificial neural networks since the beginning of the 1980s is coupled to tremendous research activity in specialized or multidisciplinary groups. Research, however, is not possible without good communication between people and the exchange of information, especially in a field covering such different areas; fast communication is also a key aspect, and this is the reason for Neural Processing Letters