Deep Learning Approach for Breast Cancer InClust 5 Prediction based on Multiomics Data Integration

A. Alkhateeb, Li Zhou, A. Tabl, L. Rueda
Proceedings of the 11th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics
Published: 2020-09-21
DOI: 10.1145/3388440.3415992 (https://doi.org/10.1145/3388440.3415992)
Citations: 9

Abstract

Breast cancer is the most common cancer among women in North America and worldwide. In this paper, we present a deep learning model based on multiomics data integration to predict the five-year interval survival of breast cancer InClust 5. The data were selected from the METABRIC dataset, which contains three omic datasets: gene expression, copy number alteration (CNA), and clinical features. The model uses a self-organizing map (SOM), an unsupervised method, to create an RGB feature map for each omic dataset; this map serves as the basis for the convolution layer of a convolutional neural network (CNN). In total, the model creates three CNNs, one for each omic dataset. This method extends the iSOM-GSN model by creating a feature map for each omic dataset instead of only one. The model combines the predictions of the three CNNs in an integration layer, which outputs the class predicted by the majority of them. The main contributions are (1) a multiomics data integration module, in which the models learn from all the omic datasets, and (2) a model that classifies a one-dimensional sample vector using a CNN. The results show high performance, with accuracy around 94 percent.
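
The integration step described in the abstract, where three per-omic CNNs each predict a class and the majority decides, can be illustrated with a short sketch. This is not the authors' implementation: the function name majority_vote, the binary 0/1 label encoding, and the example predictions are all assumptions made here for illustration only.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(per_model_labels: Sequence[Sequence[int]]) -> List[int]:
    """Combine class labels from several classifiers by majority vote.

    per_model_labels[m][i] is the label that model m assigns to sample i.
    With three models and two classes a tie cannot occur; in other
    settings Counter returns the label that was counted first.
    """
    combined = []
    # zip(*...) regroups the labels so each tuple holds one sample's votes.
    for votes in zip(*per_model_labels):
        label, _count = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined


# Hypothetical predictions for four samples (1 and 0 are assumed class codes).
gene_expression_pred = [1, 0, 1, 0]  # assumed output of the gene-expression CNN
cna_pred = [1, 1, 0, 0]              # assumed output of the CNA CNN
clinical_pred = [0, 1, 1, 0]         # assumed output of the clinical-feature CNN

combined = majority_vote([gene_expression_pred, cna_pred, clinical_pred])
print(combined)  # [1, 1, 1, 0]
```

Each of the first three samples receives two votes for class 1 and one for class 0, so the combined label is 1; the last sample is unanimous. Voting on hard labels is one simple reading of "votes based on the prediction of the majority"; the abstract does not say whether the integration layer works on labels or on class probabilities.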