{"title":"基于概率聚类和邻居嵌入的数据可视化","authors":"Xiaohui Liao, Jingqi Yan","doi":"10.23919/CHICC.2018.8482651","DOIUrl":null,"url":null,"abstract":"In the era of information explosion, processing and analyzing large-scale and high-dimensional data sets has become a big challenge for data mining and machine learning. In order to obtain and intuitively understand the information underlying the big data, an effective visualization technique is on demand. Many successful visualization techniques project high-dimensional data sets into low-dimensional spaces so that we can present data points in scatter plots, histograms or parallel coordinate plots. In this paper, we propose a new algorithm called PCNE, the algorithm first performs a probabilistic clustering algorithm for coarse classification on the data sets, and then reconstruct the joint probability with the heuristic information of classification results and neighborhood relationship. Our experimental results on the public data sets demonstrate that the PCNE algorithm outperforms the classical embedding algorithms in revealing both local and global structures of data.","PeriodicalId":158442,"journal":{"name":"2018 37th Chinese Control Conference (CCC)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Data Visualization with Probabilistic Clustering and Neighbor Embedding\",\"authors\":\"Xiaohui Liao, Jingqi Yan\",\"doi\":\"10.23919/CHICC.2018.8482651\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In the era of information explosion, processing and analyzing large-scale and high-dimensional data sets has become a big challenge for data mining and machine learning. In order to obtain and intuitively understand the information underlying the big data, an effective visualization technique is on demand. Many successful visualization techniques project high-dimensional data sets into low-dimensional spaces so that we can present data points in scatter plots, histograms or parallel coordinate plots. In this paper, we propose a new algorithm called PCNE, the algorithm first performs a probabilistic clustering algorithm for coarse classification on the data sets, and then reconstruct the joint probability with the heuristic information of classification results and neighborhood relationship. Our experimental results on the public data sets demonstrate that the PCNE algorithm outperforms the classical embedding algorithms in revealing both local and global structures of data.\",\"PeriodicalId\":158442,\"journal\":{\"name\":\"2018 37th Chinese Control Conference (CCC)\",\"volume\":\"10 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-07-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2018 37th Chinese Control Conference (CCC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.23919/CHICC.2018.8482651\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 37th Chinese Control Conference (CCC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.23919/CHICC.2018.8482651","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Data Visualization with Probabilistic Clustering and Neighbor Embedding
In the era of information explosion, processing and analyzing large-scale and high-dimensional data sets has become a big challenge for data mining and machine learning. In order to obtain and intuitively understand the information underlying the big data, an effective visualization technique is on demand. Many successful visualization techniques project high-dimensional data sets into low-dimensional spaces so that we can present data points in scatter plots, histograms or parallel coordinate plots. In this paper, we propose a new algorithm called PCNE, the algorithm first performs a probabilistic clustering algorithm for coarse classification on the data sets, and then reconstruct the joint probability with the heuristic information of classification results and neighborhood relationship. Our experimental results on the public data sets demonstrate that the PCNE algorithm outperforms the classical embedding algorithms in revealing both local and global structures of data.