Cell type-specific interpretation of noncoding variants using deep learning-based methods.

IF 11.8 2区 生物学 Q1 MULTIDISCIPLINARY SCIENCES
GigaScience Pub Date : 2023-03-20 Epub Date: 2023-03-27 DOI:10.1093/gigascience/giad015
Maria Sindeeva, Nikolay Chekanov, Manvel Avetisian, Tatiana I Shashkova, Nikita Baranov, Elian Malkin, Alexander Lapin, Olga Kardymon, Veniamin Fishman
{"title":"Cell type-specific interpretation of noncoding variants using deep learning-based methods.","authors":"Maria Sindeeva, Nikolay Chekanov, Manvel Avetisian, Tatiana I Shashkova, Nikita Baranov, Elian Malkin, Alexander Lapin, Olga Kardymon, Veniamin Fishman","doi":"10.1093/gigascience/giad015","DOIUrl":null,"url":null,"abstract":"<p><p>Interpretation of noncoding genomic variants is one of the most important challenges in human genetics. Machine learning methods have emerged recently as a powerful tool to solve this problem. State-of-the-art approaches allow prediction of transcriptional and epigenetic effects caused by noncoding mutations. However, these approaches require specific experimental data for training and cannot generalize across cell types where required features were not experimentally measured. We show here that available epigenetic characteristics of human cell types are extremely sparse, limiting those approaches that rely on specific epigenetic input. We propose a new neural network architecture, DeepCT, which can learn complex interconnections of epigenetic features and infer unmeasured data from any available input. Furthermore, we show that DeepCT can learn cell type-specific properties, build biologically meaningful vector representations of cell types, and utilize these representations to generate cell type-specific predictions of the effects of noncoding variations in the human genome.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":null,"pages":null},"PeriodicalIF":11.8000,"publicationDate":"2023-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10041527/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"GigaScience","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1093/gigascience/giad015","RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2023/3/27 0:00:00","PubModel":"Epub","JCR":"Q1","JCRName":"MULTIDISCIPLINARY SCIENCES","Score":null,"Total":0}
引用次数: 0

Abstract

Interpretation of noncoding genomic variants is one of the most important challenges in human genetics. Machine learning methods have emerged recently as a powerful tool to solve this problem. State-of-the-art approaches allow prediction of transcriptional and epigenetic effects caused by noncoding mutations. However, these approaches require specific experimental data for training and cannot generalize across cell types where required features were not experimentally measured. We show here that available epigenetic characteristics of human cell types are extremely sparse, limiting those approaches that rely on specific epigenetic input. We propose a new neural network architecture, DeepCT, which can learn complex interconnections of epigenetic features and infer unmeasured data from any available input. Furthermore, we show that DeepCT can learn cell type-specific properties, build biologically meaningful vector representations of cell types, and utilize these representations to generate cell type-specific predictions of the effects of noncoding variations in the human genome.

Abstract Image

Abstract Image

Abstract Image

利用基于深度学习的方法对非编码变异进行细胞类型特异性解读。
解读非编码基因组变异是人类遗传学最重要的挑战之一。机器学习方法近来已成为解决这一问题的有力工具。最先进的方法可以预测非编码突变引起的转录和表观遗传效应。然而,这些方法需要特定的实验数据进行训练,无法在没有实验测量所需特征的细胞类型中进行推广。我们在此表明,人类细胞类型的表观遗传特征非常稀少,这限制了那些依赖于特定表观遗传输入的方法。我们提出了一种新的神经网络架构--DeepCT,它可以学习表观遗传特征的复杂互连,并从任何可用输入中推断出未测量的数据。此外,我们还展示了 DeepCT 可以学习细胞类型的特异性,构建具有生物学意义的细胞类型向量表征,并利用这些表征对人类基因组中的非编码变异的影响进行细胞类型特异性预测。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
GigaScience
GigaScience MULTIDISCIPLINARY SCIENCES-
CiteScore
15.50
自引率
1.10%
发文量
119
审稿时长
1 weeks
期刊介绍: GigaScience seeks to transform data dissemination and utilization in the life and biomedical sciences. As an online open-access open-data journal, it specializes in publishing "big-data" studies encompassing various fields. Its scope includes not only "omic" type data and the fields of high-throughput biology currently serviced by large public repositories, but also the growing range of more difficult-to-access data, such as imaging, neuroscience, ecology, cohort data, systems biology and other new types of large-scale shareable data.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信