Fundamentals for predicting transcriptional regulations from DNA sequence patterns

IF 2.6 | Zone 3, Biology | JCR Q2, Genetics & Heredity
Masaru Koido, Kohei Tomizuka, Chikashi Terao
Journal of Human Genetics. Published 2024-05-10. DOI: 10.1038/s10038-024-01256-3. Full text: https://www.nature.com/articles/s10038-024-01256-3

Abstract

Cell-type-specific regulatory elements, cataloged by large-scale consortia through extensive experiments and bioinformatics, have enabled enrichment analyses of genetic associations that rely primarily on the positional information of those elements. These analyses have identified cell types and pathways genetically associated with human complex traits. However, our understanding of how individual alleles affect the activity and on-off state of these elements remains incomplete, which hampers the interpretation of human genetic studies. This review introduces machine learning methods that learn sequence-dependent mechanisms of transcriptional regulation from DNA sequences in order to predict such allelic effects (as opposed to associations). We provide a concise history of machine-learning-based approaches, their requirements, and their key computational processes, with an emphasis on introductory machine learning concepts. Convolution and self-attention, both pivotal in modern deep-learning models, are explained through geometrical interpretations based on dot products. This interpretation clarifies both concepts and why they have been adopted for machine learning on DNA sequences. These fundamentals should inspire further research in genetics and genomics.
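The dot-product view of convolution and self-attention mentioned in the abstract can be illustrated with a small NumPy sketch. This is not code from the review: the toy sequence, the TATA-shaped filter, the random projection matrices, and the helper names `one_hot`, `conv_scan`, and `self_attention` are all assumptions chosen for illustration.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a (length, 4) matrix, one row per base."""
    idx = [BASES.index(b) for b in seq]
    return np.eye(4)[idx]

def conv_scan(x, kernel):
    """Slide a (k, 4) filter along x; each output is one dot product
    between the filter and the window it currently covers."""
    k = kernel.shape[0]
    return np.array([np.sum(x[i:i + k] * kernel)
                     for i in range(len(x) - k + 1)])

def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention over positions."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[1])       # pairwise dot products
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # row-wise softmax
    return weights @ v, weights

# Toy input (hypothetical): a 10-bp sequence containing one TATA motif.
x = one_hot("GGTATAAAGC")

# Convolution: a filter equal to the one-hot motif scores highest
# where the window matches the motif exactly.
motif = one_hot("TATA")
conv_scores = conv_scan(x, motif)
best = int(np.argmax(conv_scores))

# Self-attention: random projections stand in for learned weights.
rng = np.random.default_rng(0)
wq, wk, wv = (rng.normal(size=(4, 8)) for _ in range(3))
attended, attn_weights = self_attention(x, wq, wk, wv)

print("best motif position:", best, "score:", conv_scores[best])
print("attention output shape:", attended.shape)
```

In this sketch the convolution output is largest where the window and the filter point in the same direction, and each attention weight comes from the dot product between one position's query and another position's key, so distant positions can influence each other directly.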

Source journal
Journal of Human Genetics (Biology: Genetics)
CiteScore: 7.20
Self-citation rate: 0.00%
Articles published: 101
Review time: 4-8 weeks
About the journal: The Journal of Human Genetics is an international journal publishing articles on human genetics, including medical genetics and human genome analysis. It covers all aspects of human genetics, including molecular genetics, clinical genetics, behavioral genetics, immunogenetics, pharmacogenomics, population genetics, functional genomics, epigenetics, genetic counseling and gene therapy. Articles on the following areas are especially welcome: genetic factors of monogenic and complex disorders, genome-wide association studies, genetic epidemiology, cancer genetics, personal genomics, genotype-phenotype relationships and genome diversity.