Self-organizing maps with adaptive distances for multiple dissimilarity matrices
Laura Maria Palomino Mariño, Francisco de Assis Tenorio de Carvalho
Machine Learning, published 2024-09-03. DOI: 10.1007/s10994-024-06607-x (https://doi.org/10.1007/s10994-024-06607-x)
Abstract
Interest in multi-view approaches has been growing because of their ability to handle data from several sources. In unsupervised learning, however, most multi-view approaches are clustering algorithms suited to vector data. Relatively few SOM algorithms can currently handle multi-view dissimilarity data, despite their usefulness. This paper proposes two new families of batch SOM algorithms for multi-view dissimilarity data: multi-medoids SOM and relational SOM. Both are designed to produce a crisp partition and to learn a relevance weight for each dissimilarity matrix by optimizing an objective function, with the aim of preserving the topological properties of the map data. In both families, the weight represents the relevance of each dissimilarity matrix for the learning task and is computed either locally, for each cluster, or globally, for the whole partition. The proposed algorithms were compared with the single-view SOM and the set-medoids SOM for multi-view dissimilarity data already available in the literature. Experiments on 14 datasets, evaluated with F-measure, NMI, Topographic Error, and Silhouette, indicate that the relevance weights of the dissimilarity matrices must be taken into account. In addition, the multi-medoids and relational SOM performed better than the set-medoids SOM. An application study on a dermatology dataset was also carried out, in which the proposed methods achieved the best performance.
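As an illustration only, the following Python sketch shows what the assignment step of a batch SOM over several dissimilarity matrices could look like. It assumes one medoid per map unit and per matrix, a Gaussian neighborhood kernel, and fixed global relevance weights. The function names, the kernel choice, and the toy data are assumptions made here and are not taken from the paper; the medoid and weight update steps are omitted.

```python
import math

def kernel(a, b, sigma):
    """Gaussian neighborhood kernel between two map positions (assumed form)."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def assign(D, weights, medoids, grid, sigma):
    """Assign each object to its best-matching map unit.

    D       : list of P dissimilarity matrices, each n x n (lists of lists)
    weights : P global relevance weights, one per matrix (assumed given)
    medoids : medoids[s][p] = index of the medoid of unit s in matrix p
    grid    : grid[s] = coordinates of unit s on the map
    """
    n = len(D[0])
    units = range(len(grid))

    def match(i, s):
        # Weighted dissimilarity between object i and the medoids of unit s.
        return sum(w * D[p][i][medoids[s][p]] for p, w in enumerate(weights))

    labels = []
    for i in range(n):
        # Cost of placing object i on unit r: neighborhood-smoothed mismatch.
        cost = [sum(kernel(grid[r], grid[s], sigma) * match(i, s) for s in units)
                for r in units]
        labels.append(min(units, key=cost.__getitem__))
    return labels

# Toy data (hypothetical): 4 objects, 2 dissimilarity matrices, 2 map units.
D1 = [[0, 1, 9, 9], [1, 0, 9, 9], [9, 9, 0, 1], [9, 9, 1, 0]]
D2 = [[0, 2, 8, 8], [2, 0, 8, 8], [8, 8, 0, 2], [8, 8, 2, 0]]
labels = assign([D1, D2], weights=[1.0, 1.0],
                medoids=[[0, 0], [2, 2]], grid=[(0,), (1,)], sigma=0.5)
print(labels)  # → [0, 0, 1, 1]
```

In this sketch, raising the weight of one matrix makes its dissimilarities dominate the assignment, which is the intuition behind learning a relevance weight per matrix.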
About the journal
Machine Learning serves as a global platform dedicated to computational approaches in learning. The journal reports substantial findings on diverse learning methods applied to various problems, offering support through empirical studies, theoretical analysis, or connections to psychological phenomena. It demonstrates the application of learning methods to solve significant problems and aims to enhance the conduct of machine learning research with a focus on verifiable and replicable evidence in published papers.