GNet: An integrated context-aware neural framework for transcription factor binding signal at single nucleotide resolution prediction.


MODERN LANGUAGE REVIEW Pub Date : 2023-07-31 DOI:10.3934/mbe.2023704

Jujuan Zhuang, Kexin Feng, Xinyang Teng, Cangzhi Jia



Abstract

Transcription factors (TFs) are key regulators of gene expression, and uncovering the mechanisms that determine their binding specificity is central to understanding gene regulation. Most previous studies focus on TF-DNA binding sites at the sequence level and seldom exploit the contextual features of DNA sequences. In this paper, we develop an integrated spatiotemporal context-aware neural network framework, named GNet, for predicting TF-DNA binding signals at single-nucleotide resolution. GNet addresses three tasks: single-nucleotide-resolution signal prediction, identification of binding regions at the sequence level, and TF-DNA binding motif prediction. GNet extracts implicit spatial contextual information with a gated highway neural mechanism, which captures large-context, multi-level patterns using linear shortcut connections; this idea runs through both the encoder and the decoder of GNet. An improved dual external attention mechanism learns implicit relationships both within and among samples and further improves model performance. Experimental results on 53 human TF ChIP-seq datasets and 6 chromatin accessibility ATAC-seq datasets show that GNet outperforms state-of-the-art methods on all three tasks. Cross-species experiments on 15 human and 18 mouse TF datasets from the corresponding TF families indicate that GNet also achieves the best cross-species prediction performance among the competing methods.
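The gated highway idea mentioned in the abstract can be illustrated with a minimal sketch. The code below implements the generic highway-layer formulation, y = T(x) * H(x) + (1 - T(x)) * x, in plain Python; it is not taken from the GNet paper, and the function names, weight values, and the choice of tanh for the transform are all assumptions made for illustration only. How GNet actually wires this gate into its encoder and decoder is not specified in the abstract.

```python
import math


def sigmoid(z):
    """Logistic function used for the transform gate."""
    return 1.0 / (1.0 + math.exp(-z))


def matvec(w, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) for row in w]


def highway_layer(x, w_h, b_h, w_t, b_t):
    """Generic highway layer (hypothetical helper, not GNet's own code).

    h is the nonlinear transform of x, t is the gate in (0, 1).
    The output mixes h with the untouched input x, so the term
    (1 - t) * x acts as a linear shortcut connection.
    """
    h = [math.tanh(a + b) for a, b in zip(matvec(w_h, x), b_h)]
    t = [sigmoid(a + b) for a, b in zip(matvec(w_t, x), b_t)]
    return [ti * hi + (1.0 - ti) * xi for ti, hi, xi in zip(t, h, x)]


# Toy feature vector for one sequence position (illustrative values).
x = [0.5, -1.0, 2.0]

# Assumed weights: a scaled identity for the transform, zeros for the gate,
# so the gate is controlled entirely by its bias.
w_h = [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.5]]
b_h = [0.0, 0.0, 0.0]
w_t = [[0.0] * 3 for _ in range(3)]

# A strongly negative gate bias closes the gate: the input passes through.
y_closed = highway_layer(x, w_h, b_h, w_t, [-20.0] * 3)

# A strongly positive gate bias opens the gate: the output is the transform.
y_open = highway_layer(x, w_h, b_h, w_t, [20.0] * 3)
```

With the gate closed the layer behaves like an identity shortcut, and with the gate open it behaves like an ordinary nonlinear layer; a trained gate sits between these extremes and decides per feature how much context to carry forward unchanged.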

