TLSurv
Yixing Jiang, Kristen Alford, Frank Ketchum, L. Tong, May D. Wang
Proceedings of the 11th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics
Published: 2020-09-21
DOI: 10.1145/3388440.3412422 (https://doi.org/10.1145/3388440.3412422)
Citations: 4
Abstract
Lung cancer is one of the leading cancers, but survival models for it have not been explored to the same extent as those for other cancers such as breast cancer. In this study, we develop a super-hybrid network called TLSurv to integrate Copy Number Variation, DNA methylation, mRNA expression, and miRNA expression data from TCGA-LUAD datasets. The modularity of this super-hybrid network allows the integration of multiple -omics modalities whose dimensionalities differ greatly. Additionally, a novel training scheme called multi-stage transfer learning is used to train the super-hybrid network incrementally. This makes it possible to train a large network with many subnetworks using a relatively small data set. At each stage, a shallow subnetwork is trained, and these subnetworks are then combined to form a powerful prediction network. The results show that combining DNA methylation data with either mRNA or miRNA expression data yields promising performance, with C-indexes of around 0.7. This performance is better than that reported in previous studies. Interpretability analysis confirms the clinical significance of some of the biomarkers identified. In addition, some novel biomarkers are suggested for future medical research. These findings reveal the potential of super-hybrid networks for integrating multiple data modalities and the potential of multi-stage transfer learning for addressing the "curse of dimensionality."
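The C-index (concordance index) quoted above measures how often a model ranks two patients in the correct order of survival. The following is a minimal sketch of Harrell's C-index in plain Python, not the authors' evaluation code. The function name `concordance_index`, its arguments, and the toy data are illustrative assumptions, and pairs with tied observed times are simply skipped for brevity.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Simplified Harrell's C-index (illustrative helper, not from the paper).

    times  : observed follow-up times
    events : 1 if the event (death) was observed, 0 if censored
    risks  : predicted risk scores (higher means shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the patient with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the earlier
        # patient actually experienced the event (was not censored).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # higher risk for the earlier event: correct order
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up for illustration only).
toy_times = [5, 8, 12, 3, 9]
toy_events = [1, 0, 1, 1, 0]
toy_risks = [0.9, 0.4, 0.3, 0.6, 0.5]
toy_c_index = concordance_index(toy_times, toy_events, toy_risks)
print(round(toy_c_index, 3))
```

On this toy data, 6 of the 7 comparable pairs are ordered correctly. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the reported C-indexes of around 0.7.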