Deep Learning Models for Predicting Phenotypic Traits and Diseases from Omics Data

Artificial Intelligence - Emerging Trends and Applications Pub Date : 2018-06-27 DOI:10.5772/INTECHOPEN.75311

Md. Mohaiminul Islam, Yang Wang, P. Hu


Citations: 7

Abstract

Computational analysis of high-throughput omics data, such as gene expression, copy number alterations and DNA methylation (DNAm), has become popular in disease studies in recent decades because such analyses can help predict whether a patient has a certain disease or one of its subtypes. However, because these data sets are high-dimensional, with hundreds of thousands of variables and very few samples, traditional machine learning approaches, such as support vector machines (SVMs) and random forests, are limited in their ability to analyze them efficiently. In this chapter, we review the progress in applying deep learning algorithms to biological questions, with a focus on potential software tools and public data sources for these tasks. In particular, we present case studies that use deep neural network (DNN) models to classify molecular subtypes of breast cancer, and DNN-based regression models that use DNAm profiles to account for interindividual variation in triglyceride concentrations measured in peripheral blood samples at different visits. We show that integrating multi-omics profiles into DNN-based learning methods could improve the prediction of the molecular subtypes of breast cancer. We also demonstrate the superiority of our proposed DNN models over the SVM model for predicting triglyceride concentrations.
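The abstract does not say how the multi-omics profiles are combined before they reach the DNN. As a minimal sketch, assuming the simplest option (standardize each omics block per feature, then concatenate the blocks sample by sample), the input preparation might look like the following. The function names `zscore_columns` and `integrate_omics` and the toy values are hypothetical and are not taken from the chapter.

```python
from statistics import mean, pstdev


def zscore_columns(block):
    """Standardize each feature (column) of a samples-by-features block."""
    scaled_columns = []
    for col in zip(*block):
        mu = mean(col)
        sigma = pstdev(col)
        # A constant feature carries no signal; map it to 0.0 rather than divide by zero.
        scaled_columns.append([(v - mu) / sigma if sigma > 0 else 0.0 for v in col])
    # Transpose back to samples-by-features.
    return [list(row) for row in zip(*scaled_columns)]


def integrate_omics(*blocks):
    """Concatenate standardized omics blocks sample by sample (assumed strategy)."""
    n_samples = len(blocks[0])
    if any(len(b) != n_samples for b in blocks):
        raise ValueError("all omics blocks must describe the same samples")
    scaled = [zscore_columns(b) for b in blocks]
    return [sum((s[i] for s in scaled), []) for i in range(n_samples)]


# Toy data (hypothetical): 3 samples with 2 expression features,
# 1 copy-number feature and 2 methylation features.
expression = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
copy_number = [[0.0], [1.0], [2.0]]
methylation = [[0.2, 0.9], [0.4, 0.7], [0.6, 0.5]]

features = integrate_omics(expression, copy_number, methylation)
```

Each row of `features` is then a single input vector for a classifier. Standardizing per block before concatenation keeps data types measured on different scales from dominating one another, though other integration strategies are equally plausible.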
