Gene expression clock: an unsupervised deep learning approach for predicting circadian rhythmicity from whole genome expression

Aram Ansary Ogholbake, Qiang Cheng
Neural Computing and Applications. Published 2024-08-14. DOI: 10.1007/s00521-024-10316-w (https://doi.org/10.1007/s00521-024-10316-w)

Abstract

Circadian rhythms are driven by an internal molecular clock that controls physiological and behavioral processes. Disruptions of these rhythms have been associated with health issues, so studying circadian rhythms is crucial for understanding physiology, behavior, and pathophysiology. However, studying circadian rhythms from gene expression data is challenging because time labels are scarce. In this paper, we propose a novel approach, based on a deep neural network (DNN) architecture, to predict the phases of un-timed samples. The approach addresses two challenges: (1) predicting sample phases and reliably identifying cyclic genes from high-dimensional expression data without relying on conserved circadian genes, and (2) handling datasets with small sample sizes. Our algorithm begins with an initial gene screening step that selects candidate cyclic genes using a Minimum Distortion Embedding framework. This stage is followed by greedy layer-wise pre-training of the DNN. Pre-training accomplishes two objectives. First, it initializes the hidden layers of the DNN so that they can capture features from the gene profiles even with limited samples. Second, it provides suitable initial values for essential aspects of gene periodic oscillations. We then fine-tune the pre-trained network to obtain precise sample phase predictions. Extensive experiments on both animal and human datasets show accurate and robust prediction of both sample phases and cyclic genes. Moreover, using an Alzheimer’s disease (AD) dataset, we identify a set of hub genes that oscillate significantly in cognitively normal subjects but are disrupted in AD, as well as their potential therapeutic targets.
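
To make the link between sample phases and cyclic genes concrete, the following is a minimal sketch of a standard cosinor regression. It is not the DNN described in the abstract. It assumes that a phase in radians has already been assigned to each sample by some method, and it scores one gene for rhythmicity by fitting a single cosine to its expression values. The function name `cosinor_fit`, the synthetic data, and the noise level are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def cosinor_fit(expr, phases):
    """Least-squares fit of expr ~ mesor + a*cos(phase) + b*sin(phase).

    Hypothetical helper for illustration only; `phases` are per-sample
    phases in radians, assumed to be known or previously predicted.
    """
    expr = np.asarray(expr, dtype=float)
    phases = np.asarray(phases, dtype=float)
    # Design matrix: intercept, cosine term, sine term.
    design = np.column_stack(
        [np.ones_like(phases), np.cos(phases), np.sin(phases)]
    )
    coef, *_ = np.linalg.lstsq(design, expr, rcond=None)
    mesor, a, b = coef
    fitted = design @ coef
    ss_res = np.sum((expr - fitted) ** 2)
    ss_tot = np.sum((expr - expr.mean()) ** 2)
    return {
        "mesor": mesor,
        # a*cos(x) + b*sin(x) = A*cos(x - phi), A = hypot(a, b), phi = atan2(b, a)
        "amplitude": np.hypot(a, b),
        "acrophase": np.arctan2(b, a) % (2 * np.pi),
        # Fraction of variance explained by the single-cosine model.
        "r2": 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0,
    }


# Synthetic example: one oscillating gene and one flat gene over 40 samples.
rng = np.random.default_rng(0)
phases = rng.uniform(0.0, 2.0 * np.pi, size=40)
cyclic = 5.0 + 2.0 * np.cos(phases - 1.0) + rng.normal(0.0, 0.1, size=40)
flat = 5.0 + rng.normal(0.0, 0.1, size=40)

fit_cyclic = cosinor_fit(cyclic, phases)
fit_flat = cosinor_fit(flat, phases)
```

In this sketch the oscillating gene should recover an amplitude near 2 and an acrophase near 1 radian with a high explained variance, while the flat gene should show a small amplitude and low explained variance. In practice a significance test and multiple-testing correction would be needed before calling a gene cyclic.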

