Srikanth Thudumu, P. Branch, Jiong Jin, Jugdutt Singh
{"title":"高维数据中局部相关子空间的估计","authors":"Srikanth Thudumu, P. Branch, Jiong Jin, Jugdutt Singh","doi":"10.1145/3373017.3373032","DOIUrl":null,"url":null,"abstract":"High-dimensional data is becoming more and more available due to the advent of big data and IoT. Having more dimensions makes data analysis cumbersome increasing the sparsity of data points due to the problem called “curse of dimensionality“. To address this problem, global dimensionality reduction techniques are used; however, these techniques are ineffective in revealing hidden outliers from the high-dimensional space. This is due to the behaviour of outliers being hidden in the subspace where they belong; hence, a locally relevant subspace is needed to reveal the hidden outliers. In this paper, we present a technique that identifies a locally relevant subspace and associated low-dimensional subspaces by deriving a final correlation score. To verify the effectiveness of the technique in determining the generalised locally relevant subspace, we evaluate the results with a benchmark data set. Our comparative analysis shows that the technique derived the locally relevant subspace that consists of relevant dimensions presented in benchmark data set.","PeriodicalId":297760,"journal":{"name":"Proceedings of the Australasian Computer Science Week Multiconference","volume":"526 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"Estimation of Locally Relevant Subspace in High-dimensional Data\",\"authors\":\"Srikanth Thudumu, P. Branch, Jiong Jin, Jugdutt Singh\",\"doi\":\"10.1145/3373017.3373032\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"High-dimensional data is becoming more and more available due to the advent of big data and IoT. Having more dimensions makes data analysis cumbersome increasing the sparsity of data points due to the problem called “curse of dimensionality“. To address this problem, global dimensionality reduction techniques are used; however, these techniques are ineffective in revealing hidden outliers from the high-dimensional space. This is due to the behaviour of outliers being hidden in the subspace where they belong; hence, a locally relevant subspace is needed to reveal the hidden outliers. In this paper, we present a technique that identifies a locally relevant subspace and associated low-dimensional subspaces by deriving a final correlation score. To verify the effectiveness of the technique in determining the generalised locally relevant subspace, we evaluate the results with a benchmark data set. Our comparative analysis shows that the technique derived the locally relevant subspace that consists of relevant dimensions presented in benchmark data set.\",\"PeriodicalId\":297760,\"journal\":{\"name\":\"Proceedings of the Australasian Computer Science Week Multiconference\",\"volume\":\"526 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-02-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the Australasian Computer Science Week Multiconference\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3373017.3373032\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Australasian Computer Science Week Multiconference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3373017.3373032","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Estimation of Locally Relevant Subspace in High-dimensional Data
High-dimensional data is becoming more and more available due to the advent of big data and IoT. Having more dimensions makes data analysis cumbersome increasing the sparsity of data points due to the problem called “curse of dimensionality“. To address this problem, global dimensionality reduction techniques are used; however, these techniques are ineffective in revealing hidden outliers from the high-dimensional space. This is due to the behaviour of outliers being hidden in the subspace where they belong; hence, a locally relevant subspace is needed to reveal the hidden outliers. In this paper, we present a technique that identifies a locally relevant subspace and associated low-dimensional subspaces by deriving a final correlation score. To verify the effectiveness of the technique in determining the generalised locally relevant subspace, we evaluate the results with a benchmark data set. Our comparative analysis shows that the technique derived the locally relevant subspace that consists of relevant dimensions presented in benchmark data set.