{"title":"Multi-omics Integration-based Predictive Model for Survival Prediction of Lung Adenocarcinoma","authors":"Vidhi Malik, S. Dutta, Yogesh Kalakoti, D. Sundar","doi":"10.1109/GHCI47972.2019.9071831","journal":{"name":"2019 Grace Hopper Celebration India (GHCI)","volume":"9 1","pages":"0"},"publicationDate":"2019-11-01","publicationTypes":"Journal Article","citationCount":"3","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1109/GHCI47972.2019.9071831"}
Multi-omics Integration-based Predictive Model for Survival Prediction of Lung Adenocarcinoma
Background: Most lung adenocarcinoma (LUAD) patients have poor clinical outcomes. A biomarker or gene signature that is built from multi-omics data together with clinical features, and that can predict survival in these patients, would have a significant clinical impact by enabling earlier detection of mortality risk and personalized therapy.
Methods: To identify a novel multi-omics signature, along with clinical features, associated with overall survival, we analyzed single-omics datasets of LUAD patients covering copy number variations (CNV), protein, methylation, mutation, RNA, and miRNA, all extracted from The Cancer Genome Atlas (TCGA). Neighborhood component analysis, a feature reduction algorithm, was applied to the large feature space of each single-omics dataset to select the optimal combination of best feature predictors. The features selected for each single-omics dataset were then combined into an integrated input and fed into a support vector machine (SVM), a neural network pattern recognizer, and a RUS ensemble boost classifier to build the survival prediction model. An external cohort was used to validate the prediction models.
Results: We identified a critical feature space for multi-omics-based integration that could effectively stratify these LUAD patients into two survival classes with 92.9% accuracy using our neural network-based model, and receiver operating characteristic (ROC) analysis indicated that the signature had strong predictive ability. Moreover, a predictive pipeline was established based on this signature integrated with clinicopathological features. When validated with single-omics data as input, prediction accuracy was lower than that of our model, which takes multi-omics data as input; integrating the omics layers therefore improved the accuracy of our classifier. Lastly, the signature was validated on our best-performing classifier, the neural network pattern recognizer, using an external cohort of excluded patients retrieved for the Group I and II study.
Conclusion: We developed a robust multi-omics signature as a self-sustaining factor that effectively classifies LUAD patients into two survival classes, i.e., alive or dead, with an unprecedented accuracy of 92.9%, which might provide a basis for personalized treatment of these patients.
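
The workflow described in the Methods (per-omics feature reduction, integration of the reduced blocks, then classification and ROC analysis) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and not the authors' implementation: the data are synthetic, the block names and sizes are hypothetical, and scikit-learn's `NeighborhoodComponentsAnalysis` learns a low-dimensional linear projection rather than selecting individual original features, so it is only a stand-in for the feature reduction step. `MLPClassifier` likewise stands in for the neural network pattern recognizer, and the RUS ensemble boost classifier is omitted. The scores it prints have no relation to the 92.9% reported above.

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NeighborhoodComponentsAnalysis
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def make_synthetic_omics(n_samples=120, seed=0):
    """Return hypothetical omics blocks and binary labels (0 = alive, 1 = dead)."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n_samples)
    blocks = {}
    for name, n_features in [("rna", 30), ("methylation", 25), ("cnv", 20)]:
        X = rng.normal(size=(n_samples, n_features))
        X[:, :3] += 1.5 * y[:, None]  # make the first three features informative
        blocks[name] = X
    return blocks, y


def reduce_and_integrate(train_blocks, test_blocks, y_train, n_components=4):
    """Fit scaler + NCA per omics block on training data only, then concatenate."""
    train_parts, test_parts = [], []
    for name in train_blocks:
        scaler = StandardScaler().fit(train_blocks[name])
        nca = NeighborhoodComponentsAnalysis(
            n_components=n_components, random_state=0
        )
        nca.fit(scaler.transform(train_blocks[name]), y_train)
        train_parts.append(nca.transform(scaler.transform(train_blocks[name])))
        test_parts.append(nca.transform(scaler.transform(test_blocks[name])))
    # Integration here is plain column-wise concatenation of the reduced blocks.
    return np.hstack(train_parts), np.hstack(test_parts)


def run_demo(seed=0):
    """Train two classifiers on the integrated features and score a held-out split."""
    blocks, y = make_synthetic_omics(seed=seed)
    idx_train, idx_test = train_test_split(
        np.arange(len(y)), test_size=36, stratify=y, random_state=seed
    )
    train_blocks = {k: v[idx_train] for k, v in blocks.items()}
    test_blocks = {k: v[idx_test] for k, v in blocks.items()}
    X_train, X_test = reduce_and_integrate(train_blocks, test_blocks, y[idx_train])

    models = {
        "svm": SVC(kernel="rbf", probability=True, random_state=seed),
        "neural_net": MLPClassifier(
            hidden_layer_sizes=(16,), max_iter=2000, random_state=seed
        ),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_train, y[idx_train])
        prob_dead = model.predict_proba(X_test)[:, 1]  # score for class 1 (dead)
        results[name] = {
            "accuracy": accuracy_score(y[idx_test], model.predict(X_test)),
            "roc_auc": roc_auc_score(y[idx_test], prob_dead),
        }
    return X_train.shape, X_test.shape, results


train_shape, test_shape, results = run_demo()
for name, metrics in results.items():
    print(name, metrics)
```

The scaler and the reduction step are fitted on the training split only and then applied to the held-out split, so no label information from the evaluation samples leaks into the reduced features. A truly external cohort, as used in the study, would be transformed the same way with the already fitted objects.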