Progressive Feature Upgrade in Semi-supervised Learning on Tabular Domain
Morteza Mohammady Gharasuie, Fenjiao Wang
2022 IEEE International Conference on Knowledge Graph (ICKG), 2022-11-01
https://doi.org/10.1109/ICKG55886.2022.00031
Recent semi-supervised and self-supervised methods have achieved strong results in the image and text domains by relying on augmentation techniques. This success does not transfer easily to the tabular domain: the common transformations used for images and language are not readily adaptable to tabular data, which mixes continuous and categorical data types. A few semi-supervised works on the tabular domain have focused on proposing new augmentation techniques for tabular data. These approaches may have shown some improvement on datasets whose categorical data have low cardinality, but the fundamental challenges have not been tackled: the proposed methods either do not apply to datasets with high cardinality or do not use an efficient encoding of categorical data. We propose using a conditional probability representation and an efficient progressive feature upgrading framework to effectively learn representations for tabular data in semi-supervised applications. Extensive experiments show the superior performance of the proposed framework and its potential application in semi-supervised settings.
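As a rough illustration of what a conditional probability representation of a categorical feature could look like, the sketch below replaces each category value with an estimate of P(label | value) computed from the labeled rows. This is only one plausible reading of the term and is not taken from the paper: the function names `fit_conditional_probs` and `encode`, the additive smoothing parameter `alpha`, the fallback to the class prior for unseen values, and the toy data are all assumptions made for this example. The point it shows is that the encoded width depends on the number of classes rather than on the number of distinct category values, which is why such an encoding stays compact for high-cardinality features.

```python
from collections import Counter, defaultdict


def fit_conditional_probs(values, labels, alpha=1.0):
    """Estimate P(label | category value) with additive smoothing (assumed)."""
    classes = sorted(set(labels))
    counts = defaultdict(Counter)
    for value, label in zip(values, labels):
        counts[value][label] += 1

    table = {}
    for value, label_counts in counts.items():
        total = sum(label_counts.values()) + alpha * len(classes)
        table[value] = [(label_counts[c] + alpha) / total for c in classes]

    # Class prior, used as a fallback for values never seen with a label.
    label_totals = Counter(labels)
    prior = [label_totals[c] / len(labels) for c in classes]
    return classes, table, prior


def encode(values, table, prior):
    """Map each category value to its vector of conditional probabilities."""
    return [table.get(value, prior) for value in values]


# Hypothetical toy data: one categorical column and binary labels.
cities = ["A", "A", "B", "B", "B", "C"]
labels = [1, 0, 1, 1, 0, 1]

classes, table, prior = fit_conditional_probs(cities, labels)
encoded = encode(["B", "Z"], table, prior)  # "Z" is unseen, so it gets the prior
```

Each encoded row has one entry per class (two here), whereas a one-hot encoding of the same column would need one entry per distinct category value.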