Deep Learning Models for Predicting Phenotypic Traits and Diseases from Omics Data

Md. Mohaiminul Islam, Yang Wang, P. Hu
DOI: 10.5772/INTECHOPEN.75311 (https://doi.org/10.5772/INTECHOPEN.75311)
Published in: Artificial Intelligence - Emerging Trends and Applications, 2018-06-27
Citations: 7

Abstract

Computational analysis of high-throughput omics data, such as gene expression, copy number alterations and DNA methylation (DNAm), has become popular in disease studies in recent decades because such analyses can help predict whether a patient has a certain disease or one of its subtypes. However, because these data sets are high-dimensional, with hundreds of thousands of variables and very few samples, traditional machine learning approaches, such as support vector machines (SVMs) and random forests, are limited in their ability to analyze these data efficiently. In this chapter, we review progress in applying deep learning algorithms to biological questions. The focus is on potential software tools and public data sources for these tasks. In particular, we present case studies using deep neural network (DNN) models to classify molecular subtypes of breast cancer, and DNN-based regression models that use DNAm profiles from peripheral blood samples to account for interindividual variation in triglyceride concentrations measured at different visits. We show that integrating multi-omics profiles into DNN-based learning methods could improve the prediction of the molecular subtypes of breast cancer. We also show that our proposed DNN models outperform the SVM model in predicting triglyceride concentrations.