A multi-context feature learning approach to identify disease-specific gene neighborhoods

S. Ghandikota, A. Jegga
Proceedings of the 11th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, published 2020-09-21
DOI: 10.1145/3388440.3412419 (https://doi.org/10.1145/3388440.3412419)

Abstract

Analyzing gene networks in a specific phenotype state can provide important insights into the pathways and biological processes underlying the onset and progression of disease. Specifically, analyzing gene neighborhoods around key disease-driver genes and transcription factors (TFs) can lead to the discovery of regulatory networks and novel therapeutic targets. Traditional methods for deciphering these regulatory networks rely mostly on transcriptomic signals and do not incorporate the different functional contexts available, which makes them inadequate for modeling the inherently complex relationships between genes and their neighborhoods. We present a neural network-based representation learning framework that uses both co-expression and functional gene contexts to learn continuous gene representations. It can be used to extract distributed representations of genes in normal states (e.g., control, wild-type) and perturbed states (e.g., disease, knockout) by integrating co-expressed gene pairs from multiple transcriptomic datasets. To show the utility of this approach, we trained our model on whole lung tissue transcriptomic studies of idiopathic pulmonary fibrosis (IPF) to generate disease-specific gene representations. We compare the gene features from our method with those from two other representation learning methods by generating and analyzing the regulatory gene neighborhoods of known transcription factors in lung tissue. Using several TF-target gene set libraries, we show that the regulatory gene neighborhoods generated by our method are biologically relevant.
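The abstract does not say how a gene neighborhood is read off the learned representations, so the following is only a minimal sketch under stated assumptions: that each gene has a fixed-length embedding vector, that a TF's neighborhood is taken as its top-k most cosine-similar genes, and that relevance is checked by overlap with a TF-target gene set. The gene labels, vectors, target set, and function names are all hypothetical toy values, not data or code from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def gene_neighborhood(embeddings, query, k=2):
    """Return the k genes most similar to `query`, with their scores.

    Assumption: 'neighborhood' means top-k by cosine similarity.
    """
    q = embeddings[query]
    scored = [
        (gene, cosine(q, vec))
        for gene, vec in embeddings.items()
        if gene != query
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

def overlap_with_targets(neighborhood, known_targets):
    """Genes in the neighborhood that also appear in a TF-target set."""
    return {gene for gene, _ in neighborhood} & set(known_targets)

# Hypothetical toy embeddings (not real expression-derived values).
toy_embeddings = {
    "TF_A":   [1.0, 0.0, 0.0],
    "GENE_1": [0.9, 0.1, 0.0],
    "GENE_2": [0.0, 1.0, 0.0],
    "GENE_3": [0.7, 0.7, 0.0],
}

# Hypothetical TF-target gene set for TF_A.
toy_targets = {"GENE_1", "GENE_2"}

neighborhood = gene_neighborhood(toy_embeddings, "TF_A", k=2)
shared = overlap_with_targets(neighborhood, toy_targets)
```

In practice one would probably replace the raw overlap with an enrichment statistic across many TFs and libraries, but the choice of statistic is not stated in the abstract and is left open here.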