Authors: G. Gokdogan, Elif Vural
Published in: 2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP), pp. 1-6, 2017-09-01
DOI: 10.1109/MLSP.2017.8168182 (https://doi.org/10.1109/MLSP.2017.8168182)
Progressive clustering of manifold-modeled data based on tangent space variations
Understanding and analyzing manifold-modeled data for clustering and classification has been an important research topic in recent years. Most clustering methods developed for data with a non-linear, low-dimensional structure rely on local linearity assumptions. However, clustering algorithms based on locally linear representations tolerate difficult sampling conditions only to some extent, and may fail for sparsely sampled data manifolds or in high-curvature regions. In this paper, we consider a setting where each cluster is concentrated around a manifold, and we propose a manifold clustering algorithm that relies on the observation that the variation of the tangent space must be consistent along curves over the same data manifold. To achieve robustness against noise, manifold intersections, and high curvature, we propose a progressive clustering approach. First, by observing the variation of the tangent space, we detect the non-problematic manifold regions and form pre-clusters from the data samples belonging to these reliable regions. Next, the pre-clusters are merged into larger clusters subject to constraints on both the distance and the tangent space variations. Finally, the samples identified as problematic are assigned to the computed clusters to complete the clustering. Experiments on synthetic and real datasets show that the proposed method outperforms the compared manifold clustering algorithms based on Euclidean distance and sparse representations.
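The abstract does not say how tangent spaces are estimated or how their variation is measured, so the following is only an illustrative sketch of the first stage (separating reliable samples from problematic ones), not the authors' algorithm. It assumes tangent spaces are estimated by PCA over the k nearest neighbours and that variation is the mean largest principal angle between a sample's tangent space and those of its neighbours. The function names and the parameters `k`, `d`, and `max_angle` are hypothetical. The merging and final assignment stages are not covered.

```python
import numpy as np

def local_tangent_bases(X, k, d):
    """Estimate a d-dimensional tangent basis at each sample by PCA
    on its k nearest neighbours (assumed estimator, not from the paper)."""
    n, dim = X.shape
    # Pairwise squared Euclidean distances between all samples
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    # Drop the first sorted column (the sample itself, assuming no duplicates)
    nbrs = np.argsort(sq, axis=1)[:, 1:k + 1]
    bases = np.empty((n, dim, d))
    for i in range(n):
        patch = X[nbrs[i]] - X[nbrs[i]].mean(axis=0)
        # Leading right singular vectors span the dominant patch directions
        _, _, vt = np.linalg.svd(patch, full_matrices=False)
        bases[i] = vt[:d].T
    return bases, nbrs

def tangent_variation(bases, nbrs):
    """Mean largest principal angle (radians) between each sample's
    tangent space and the tangent spaces of its neighbours."""
    n = bases.shape[0]
    variation = np.empty(n)
    for i in range(n):
        angles = []
        for j in nbrs[i]:
            # Singular values are the cosines of the principal angles
            s = np.linalg.svd(bases[i].T @ bases[j], compute_uv=False)
            angles.append(np.arccos(np.clip(s.min(), 0.0, 1.0)))
        variation[i] = np.mean(angles)
    return variation

def split_reliable(X, k=8, d=1, max_angle=0.3):
    """Return a boolean mask of samples whose tangent space varies
    little across their neighbourhood, plus the variation values."""
    bases, nbrs = local_tangent_bases(X, k, d)
    variation = tangent_variation(bases, nbrs)
    return variation <= max_angle, variation

if __name__ == "__main__":
    # Two intersecting lines in the plane, sampled without the origin
    t = np.linspace(-1.0, 1.0, 40)
    X = np.vstack([np.column_stack([t, np.zeros_like(t)]),
                   np.column_stack([np.zeros_like(t), t])])
    reliable, variation = split_reliable(X)
    print("reliable samples:", int(reliable.sum()), "of", len(X))
```

In this toy setup, samples near the intersection have neighbours on both lines, so their tangent estimates disagree with those of their neighbours and they fall outside the reliable set. The threshold `max_angle` is an arbitrary choice here and would need tuning for real data.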