An Unsupervised Feature Learning Method for Enhancing the Generalization of Cancer Diagnosis
Zhen Liu, Ruoyu Wang, Wen-bo Zhang, Deyu Tang
2021 13th International Conference on Machine Learning and Computing, 2021-02-26
DOI: 10.1145/3457682.3457720 (https://doi.org/10.1145/3457682.3457720)
Citations: 1
Abstract
Machine learning techniques have been applied to gene expression profiles for cancer diagnosis. However, gene expression data suffer from the curse of dimensionality. Various feature selection methods have been proposed to reduce the number of features for diagnosing specific cancers. Because samples of a particular tumor are difficult to obtain, the resulting lack of training samples leads to overfitting. To address both problems, this paper proposes an unsupervised feature learning method that improves the learned features by leveraging unlabeled samples from other sources. Because the method exploits knowledge shared among expression data from different sources, it can boost cancer classification performance. Experimental results on gene expression data show that the method improves the generalization of cancer diagnosis when unlabeled data are used for unsupervised feature learning.
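
The abstract does not say which feature learner or classifier the authors use. As a minimal sketch of the general idea only, the code below assumes PCA as a stand-in unsupervised feature learner and a nearest-centroid classifier, and runs on synthetic data rather than real expression profiles. All function and variable names are hypothetical. It shows the workflow the abstract describes: learn a low-dimensional feature map from pooled unlabeled samples, then train on the few labeled samples in that feature space.

```python
import numpy as np

def fit_pca(X, n_components):
    """Learn a linear feature map from unlabeled rows of X (samples x genes)."""
    mean = X.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def transform(X, mean, components):
    """Project samples onto the learned low-dimensional features."""
    return (X - mean) @ components.T

def nearest_centroid_predict(Z_train, y_train, Z_test):
    """Assign each test sample to the class with the closest mean feature vector."""
    classes = np.unique(y_train)
    centroids = np.stack([Z_train[y_train == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(Z_test[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]

# Synthetic stand-in for expression data (an assumption, not the paper's data):
# many genes driven by a few latent factors, with the class shifting one factor.
rng = np.random.default_rng(0)
n_genes, n_factors = 500, 5
loadings = rng.normal(size=(n_factors, n_genes))

def sample(n):
    y = rng.integers(0, 2, size=n)
    factors = rng.normal(size=(n, n_factors))
    factors[:, 0] += 3.0 * y
    X = factors @ loadings + rng.normal(size=(n, n_genes))
    return X, y

X_other, _ = sample(300)        # samples from other sources; labels discarded
X_train, y_train = sample(20)   # small labeled set for the target task
X_test, y_test = sample(50)

# Unsupervised step: learn features from all available samples, ignoring labels.
mean, components = fit_pca(np.vstack([X_other, X_train]), n_components=5)

# Supervised step: train and predict in the learned feature space.
Z_train = transform(X_train, mean, components)
Z_test = transform(X_test, mean, components)
y_pred = nearest_centroid_predict(Z_train, y_train, Z_test)
accuracy = float((y_pred == y_test).mean())
print(f"test accuracy: {accuracy:.2f}")
```

In this sketch the classifier sees 5 features instead of 500, which is one way a learned representation can reduce overfitting when labeled samples are scarce. Whether pooling helps in practice depends on how much structure the other sources actually share with the target data.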