A Novel Dimension Reduction Framework Based on UMAP for Improving the Timeliness of Small Samples
Authors: Nannan Dong, Jia-Ping Cheng, Jiazheng Lv, Xudong Zhong
Published in: 2022 11th International Conference on Information Communication and Applications (ICICA), 2022-06-01
DOI: 10.1109/ICICA56942.2022.00011 (https://doi.org/10.1109/ICICA56942.2022.00011)
UMAP (Uniform Manifold Approximation and Projection) is a non-linear dimension reduction method that can process large datasets quickly. However, balancing timeliness and accuracy is challenging when reducing the dimension of small-sample datasets that contain noise. To further improve timeliness, we propose a novel dimension reduction framework based on UMAP that introduces information entropy and LRR (Low-Rank Representation). We first apply LRR to the small-sample dataset to remove noise. We then compute an entropy threshold from the entropy weight of each data feature and use it to select valuable features. Finally, UMAP reduces the dimension of the dataset restricted to those features. Experiments on datasets that we generated and on several UCI datasets verify that the proposed framework is feasible and effective.
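The feature-selection step can be illustrated with a minimal sketch of the standard entropy weight method. The abstract does not say how the entropy threshold is derived from the weights, so the mean-weight rule used here is an assumption, and the function names `entropy_weights` and `select_features` are hypothetical. The LRR denoising and UMAP steps are not shown.

```python
import math


def entropy_weights(rows):
    """Return one entropy weight per feature (column) of `rows`."""
    n = len(rows)
    if n < 2:
        raise ValueError("need at least two samples")
    cols = list(zip(*rows))
    divergences = []
    for col in cols:
        lo, hi = min(col), max(col)
        span = hi - lo
        # A constant column carries no information: zero divergence.
        if span == 0:
            divergences.append(0.0)
            continue
        # Min-max scale, then turn the column into a distribution over samples.
        scaled = [(v - lo) / span for v in col]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        # Shannon entropy normalised by ln(n); 0 * ln(0) is treated as 0.
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        divergences.append(1.0 - entropy)
    d_sum = sum(divergences)
    if d_sum == 0:
        # All features are constant: fall back to equal weights.
        return [1.0 / len(cols)] * len(cols)
    return [d / d_sum for d in divergences]


def select_features(rows, threshold=None):
    """Keep features whose entropy weight is at least `threshold`.

    ASSUMPTION: when no threshold is given, the mean weight is used.
    The paper's actual threshold rule is not stated in the abstract.
    """
    weights = entropy_weights(rows)
    if threshold is None:
        threshold = sum(weights) / len(weights)
    kept = [j for j, w in enumerate(weights) if w >= threshold]
    reduced = [[row[j] for j in kept] for row in rows]
    return kept, reduced


if __name__ == "__main__":
    data = [
        [1.0, 5.0, 0.0],
        [2.0, 5.0, 0.0],
        [3.0, 5.0, 10.0],
        [4.0, 5.0, 0.0],
    ]
    kept, reduced = select_features(data)
    print("kept feature indices:", kept)
    print("reduced rows:", reduced)
```

In a full pipeline, the reduced rows would then be passed to a UMAP implementation, for example `umap.UMAP(n_components=2).fit_transform(reduced)` from the umap-learn package. That final call is shown only as one plausible choice and is not confirmed by the source.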