Unveiling the genetic symphony: Deep learning for decoding promoters and non-promoters in DNA sequence

IF 0.9 Q4 GENETICS & HEREDITY
Mohamudha Parveen Rahamathulla, Shtwai Alsubai, Mohemmed Sha
Journal: Gene Reports, volume 40, Article 102283
Publication date: 2025-06-26
Publication type: Journal Article
DOI: 10.1016/j.genrep.2025.102283
URL: https://www.sciencedirect.com/science/article/pii/S2452014425001566
Open access: no
Citation count: 0

Abstract

Promoters are regions of DNA that are vital for the transcription of specific genes in the genome. DNA contains several types of promoters, each performing particular functions. Mutations in promoters are a major cause of many diseases, such as cancer and diabetes, so effective identification of promoters is important for diagnosing certain diseases. Traditional promoter identification is expensive and time-consuming. To address this, several conventional methods have attempted to build better promoter and non-promoter identification systems, but they are limited in handling larger datasets, in speed, and in accuracy. The proposed system therefore employs a Bi-GRU (bidirectional gated recurrent unit) with M-AM (Modified Attention Mechanism) to detect promoters and non-promoters in DNA sequences. Bi-GRU is used for its lower complexity, faster convergence, and ability to process large amounts of data quickly. Although Bi-GRU is widely used for detecting promoters and non-promoters, it has minor limitations, such as in handling larger datasets. To resolve this, the proposed system pairs Bi-GRU with M-AM to focus on the significant features of the data, which enhances the model's efficacy. The M-AM uses a numerical significance measure and sequence context weights. The system is applied to a DNA sequence dataset of promoters and non-promoters, and its performance is then calculated with evaluation metrics. Further, the proposed detection method is compared internally with LSTM (long short-term memory), Bi-LSTM, and GRU to evaluate the model's efficiency. The proposed detection of promoters and non-promoters is intended to assist researchers and physicians in genomics and molecular biology in improving the diagnosis of diseases and disorders caused by promoter mutations.
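The abstract does not spell out the architecture, so the following is only a minimal sketch of the general idea: a bidirectional GRU reads a one-hot encoded DNA sequence, and an attention layer pools the per-position states into a single promoter score. Everything in it is an assumption made for illustration: the names (`GRUCell`, `bi_gru`, `attention_pool`, `promoter_score`), the hidden size, the bias-free gates, and the random untrained weights. The pooling shown is plain dot-product attention, not the paper's M-AM, whose numerical significance measure and sequence context weights are not defined in the abstract.

```python
import math
import random

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as 4-dim one-hot vectors (unknown bases become all zeros)."""
    return [[1.0 if b == base else 0.0 for base in BASES] for b in seq.upper()]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

class GRUCell:
    """Bias-free GRU cell with random (untrained) weights, for illustration only."""

    def __init__(self, n_in, n_hid, rng):
        # One input matrix W and one recurrent matrix U per gate:
        # z = update gate, r = reset gate, h = candidate state.
        self.W = {g: rand_matrix(n_hid, n_in, rng) for g in "zrh"}
        self.U = {g: rand_matrix(n_hid, n_hid, rng) for g in "zrh"}
        self.n_hid = n_hid

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        rh = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a + b) for a, b in zip(matvec(self.W["h"], x), matvec(self.U["h"], rh))]
        # Blend the previous state and the candidate using the update gate.
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

    def run(self, xs):
        h = [0.0] * self.n_hid
        states = []
        for x in xs:
            h = self.step(x, h)
            states.append(h)
        return states

def bi_gru(xs, fwd, bwd):
    """Concatenate forward states with backward states realigned to the same positions."""
    f = fwd.run(xs)
    b = bwd.run(xs[::-1])[::-1]
    return [hf + hb for hf, hb in zip(f, b)]

def attention_pool(states, v):
    """Dot-product attention: softmax over per-position scores, then a weighted sum."""
    scores = [sum(vi * si for vi, si in zip(v, s)) for s in states]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    context = [sum(w * s[d] for w, s in zip(weights, states)) for d in range(len(states[0]))]
    return context, weights

def promoter_score(seq, fwd, bwd, v, w_out):
    states = bi_gru(one_hot(seq), fwd, bwd)
    context, weights = attention_pool(states, v)
    return sigmoid(sum(w * c for w, c in zip(w_out, context))), weights

# Hypothetical setup: hidden size 8, fixed seed, made-up 16-base sequence.
rng = random.Random(0)
HID = 8
fwd, bwd = GRUCell(4, HID, rng), GRUCell(4, HID, rng)
v = [rng.uniform(-0.5, 0.5) for _ in range(2 * HID)]
w_out = [rng.uniform(-0.5, 0.5) for _ in range(2 * HID)]
score, weights = promoter_score("TATAAAGGCGCGTACG", fwd, bwd, v, w_out)
```

Because the weights are random, `score` carries no biological meaning here. The sketch only shows the data flow: `weights` holds one attention value per base, which is the quantity a trained attention layer would use to emphasise informative positions in the sequence.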
Source journal: Gene Reports
Subject area: Biochemistry, Genetics and Molecular Biology - Genetics
CiteScore: 3.30
Self-citation rate: 7.70%
Articles published: 246
Review time: 49 days

Journal description: Gene Reports publishes papers that focus on the regulation, expression, function and evolution of genes in all biological contexts, including all prokaryotic and eukaryotic organisms, as well as viruses. Gene Reports strives to be a very diverse journal and topics in all fields will be considered for publication. Although not limited to the following, some general topics include:

DNA Organization, Replication & Evolution: focus on genomic DNA (chromosomal organization, comparative genomics, DNA replication, DNA repair, mobile DNA, mitochondrial DNA, chloroplast DNA).
Expression & Function: focus on functional RNAs (microRNAs, tRNAs, rRNAs, mRNA splicing, alternative polyadenylation).
Regulation: focus on processes that mediate gene read-out (epigenetics, chromatin, histone code, transcription, translation, protein degradation).
Cell Signaling: focus on mechanisms that control information flow into the nucleus to control gene expression (kinase and phosphatase pathways controlled by extracellular ligands, Wnt, Notch, TGFbeta/BMPs, FGFs, IGFs, etc.).
Profiling of gene expression and genetic variation: focus on high-throughput approaches (e.g., DeepSeq, ChIP-Seq, Affymetrix microarrays, proteomics) that define gene regulatory circuitry, molecular pathways and protein/protein networks.
Genetics: focus on development in model organisms (e.g., mouse, frog, fruit fly, worm), human genetic variation, population genetics, as well as agricultural and veterinary genetics.
Molecular Pathology & Regenerative Medicine: focus on the deregulation of molecular processes in human diseases and mechanisms supporting regeneration of tissues through pluripotent or multipotent stem cells.