Classification of Liver Cancer Subtypes Based on Hierarchical Integrated Stacked Autoencoder
Tiantian Zhang, Shuxu Zhao, Zhaoping Zhang
Proceedings of the 6th International Conference on Robotics and Artificial Intelligence, 2020-11-20
DOI: 10.1145/3449301.3449316 (https://doi.org/10.1145/3449301.3449316)
Citations: 0
Abstract
High-throughput sequencing technology has made it possible to obtain multi-omics data for liver cancer. However, omics data often come from different platforms and have different attributes, and they are characterized by high feature dimensionality and small sample sizes. These properties aggravate model overfitting and class imbalance, and cross-platform integrative analysis of omics data challenges traditional data analysis methods. To address this, the Hierarchical Integrated Stacked Autoencoder (HI-SAE) is proposed, which achieves deeper feature learning and data integration while reducing the differences caused by the characteristics of the data themselves. Finally, the integrated feature representation is used to identify the liver cancer subtype with a softmax classifier. Experiments show that the classification accuracy obtained when using HI-SAE for feature learning is 3.7% higher than when using PCA and 7.6% higher than when using NMF.
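The pipeline described in the abstract (per-platform feature learning, integration of the learned features, then softmax classification) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not give the layer sizes, activation, loss, or training procedure, so the tanh encoders, mean squared error loss, full-batch gradient descent, the hidden sizes, the synthetic two-block data, and all function names (`train_autoencoder`, `hi_sae_features`, `train_softmax`) are assumptions made for this example.

```python
import numpy as np


def train_autoencoder(X, n_hidden, epochs=300, lr=0.1, seed=0):
    """One-hidden-layer autoencoder (tanh encoder, linear decoder) trained by
    full-batch gradient descent on mean squared reconstruction error.
    Returns the encoder parameters and the per-epoch loss history."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W1 = rng.normal(0.0, 0.1, (d, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, d))
    b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)           # encode
        err = (H @ W2 + b2) - X            # decode and compare with the input
        losses.append(float(np.mean(err ** 2)))
        dR = 2.0 * err / err.size          # gradient of the MSE w.r.t. the output
        dZ = (dR @ W2.T) * (1.0 - H ** 2)  # backpropagate through tanh
        W2 -= lr * (H.T @ dR)
        b2 -= lr * dR.sum(axis=0)
        W1 -= lr * (X.T @ dZ)
        b1 -= lr * dZ.sum(axis=0)
    return (W1, b1), losses


def hi_sae_features(blocks, block_hidden=8, joint_hidden=4):
    """Assumed two-level scheme: one autoencoder per omics block, then a
    second autoencoder over the concatenated block codes."""
    codes = []
    for i, X in enumerate(blocks):
        (W, b), _ = train_autoencoder(X, block_hidden, seed=i)
        codes.append(np.tanh(X @ W + b))
    joint = np.hstack(codes)
    (W, b), losses = train_autoencoder(joint, joint_hidden, seed=99)
    return np.tanh(joint @ W + b), losses


def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)   # shift for numerical stability
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)


def train_softmax(F, y, n_classes, epochs=300, lr=0.5):
    """Softmax (multinomial logistic) classifier trained by gradient descent
    on the cross-entropy loss."""
    W = np.zeros((F.shape[1], n_classes))
    b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]               # one-hot labels
    for _ in range(epochs):
        G = (softmax(F @ W + b) - Y) / len(y)
        W -= lr * (F.T @ G)
        b -= lr * G.sum(axis=0)
    return W, b


# Synthetic stand-in for two omics platforms (not real liver cancer data).
rng = np.random.default_rng(0)
y = np.repeat(np.arange(3), 30)            # 3 hypothetical subtypes, 30 samples each
Z = np.eye(3)[y] + 0.3 * rng.normal(size=(90, 3))
X_a = Z @ rng.normal(size=(3, 40)) + 0.5 * rng.normal(size=(90, 40))
X_b = Z @ rng.normal(size=(3, 30)) + 0.5 * rng.normal(size=(90, 30))
# Standardize each block separately so platform-specific scales do not dominate.
blocks = [(X - X.mean(axis=0)) / X.std(axis=0) for X in (X_a, X_b)]

features, joint_losses = hi_sae_features(blocks)
W, b = train_softmax(features, y, n_classes=3)
probs = softmax(features @ W + b)
print("training accuracy:", float(np.mean(probs.argmax(axis=1) == y)))
```

Encoding each block separately before integration is one plausible way to reduce platform-specific differences, since each block is standardized and compressed on its own scale before the joint layer sees it. A PCA or NMF baseline of the kind mentioned in the abstract would replace `hi_sae_features` with the corresponding decomposition applied to the same blocks.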