Title: Unveiling the genetic symphony: Deep learning for decoding promoters and non-promoters in DNA sequence
Authors: Mohamudha Parveen Rahamathulla, Shtwai Alsubai, Mohemmed Sha
Journal: Gene Reports, volume 40, Article 102283
DOI: 10.1016/j.genrep.2025.102283
Publication date: 2025-06-26
Publication type: Journal Article
Impact factor: 0.9 (JCR Q4, Genetics & Heredity)
Open access: no
URL: https://www.sciencedirect.com/science/article/pii/S2452014425001566
Citations: 0
Abstract
Promoters are regions of DNA that are vital for the transcription of specific genes in the genome. Several types of promoters exist, each performing particular functions. Mutations in promoters are a major cause of many diseases, such as cancer and diabetes, so effective promoter identification is important for diagnosing certain diseases. Traditional promoter identification is expensive and time-consuming. Several conventional methods have attempted to build better promoter and non-promoter identification systems, but they suffer from limitations in handling larger datasets, speed, and accuracy. Therefore, the proposed system employs a Bi-GRU (Bidirectional Gated Recurrent Unit) with an M-AM (Modified Attention Mechanism) to detect promoters and non-promoters in DNA sequences. Bi-GRU is chosen for its low complexity, fast convergence, and ability to process large volumes of data quickly. Although Bi-GRU is widely used for detecting promoters and non-promoters, it still has minor limitations, such as handling larger datasets. To address this, the proposed system combines Bi-GRU with M-AM so that the model focuses on the significant features of the data, which enhances its efficacy. The M-AM is built on a numerical significance measure and sequence context weights. A DNA sequence dataset of promoters and non-promoters is used in the proposed system. The performance of the proposed system is assessed with evaluation metrics. Further, the proposed detection method is compared internally with LSTM (Long Short-Term Memory), Bi-LSTM, and GRU to evaluate the model's efficiency. The proposed detection of promoters and non-promoters is intended to assist researchers and physicians in genomics and molecular biology in improving the diagnosis of diseases and disorders caused by promoter mutations.
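As a rough illustration of the architecture the abstract names, the following is a minimal pure-Python sketch of a bidirectional GRU followed by attention pooling over a one-hot encoded DNA sequence. It rests on several assumptions that are not taken from the paper: the attention shown is plain dot-product softmax attention, not the authors' M-AM; the weights are random and untrained; biases are omitted; and every name here (`one_hot`, `GRUCell`, `bi_gru`, `attention_pool`, `promoter_probability`) is hypothetical.

```python
import math
import random

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as 4-dimensional one-hot vectors (unknown bases map to all zeros)."""
    return [[1.0 if b == base else 0.0 for base in BASES] for b in seq.upper()]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

class GRUCell:
    """Standard GRU cell: update gate z, reset gate r, candidate state (biases omitted)."""
    def __init__(self, n_in, n_hid, rng):
        self.n_hid = n_hid
        # One input matrix W and one recurrent matrix U per gate: z, r, and candidate h.
        self.W = {g: rand_matrix(n_hid, n_in, rng) for g in "zrh"}
        self.U = {g: rand_matrix(n_hid, n_hid, rng) for g in "zrh"}

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        rh = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a + b) for a, b in zip(matvec(self.W["h"], x), matvec(self.U["h"], rh))]
        # Blend the previous state and the candidate; z controls how much is replaced.
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

    def run(self, xs):
        h, out = [0.0] * self.n_hid, []
        for x in xs:
            h = self.step(x, h)
            out.append(h)
        return out

def bi_gru(xs, fwd, bwd):
    """Concatenate forward states with backward states aligned to the same position."""
    f = fwd.run(xs)
    b = bwd.run(xs[::-1])[::-1]
    return [fs + bs for fs, bs in zip(f, b)]

def attention_pool(states, query):
    """Softmax over dot-product scores, then a weighted sum of the states."""
    scores = [sum(q * s for q, s in zip(query, h)) for h in states]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    context = [sum(w * h[i] for w, h in zip(weights, states)) for i in range(len(query))]
    return context, weights

def promoter_probability(seq, seed=0, n_hid=8):
    """Return (probability, per-position attention weights) from an untrained model."""
    rng = random.Random(seed)
    fwd, bwd = GRUCell(4, n_hid, rng), GRUCell(4, n_hid, rng)
    query = [rng.uniform(-0.5, 0.5) for _ in range(2 * n_hid)]
    out_w = [rng.uniform(-0.5, 0.5) for _ in range(2 * n_hid)]
    states = bi_gru(one_hot(seq), fwd, bwd)
    context, weights = attention_pool(states, query)
    logit = sum(w * c for w, c in zip(out_w, context))
    return sigmoid(logit), weights
```

Calling `promoter_probability("TATAATGCGCGTTGACA")` returns a score between 0 and 1 together with one attention weight per base. Because the parameters are random, the score carries no biological meaning; the sketch only shows how attention weights let a recurrent model emphasize some sequence positions over others.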
Gene Reports
Subject area: Biochemistry, Genetics and Molecular Biology - Genetics
CiteScore: 3.30
Self-citation rate: 7.70%
Articles published: 246
Review time: 49 days
About the journal:
Gene Reports publishes papers that focus on the regulation, expression, function and evolution of genes in all biological contexts, including all prokaryotic and eukaryotic organisms, as well as viruses. Gene Reports strives to be a very diverse journal, and topics in all fields will be considered for publication. Although not limited to the following, some general topics include:
DNA Organization, Replication & Evolution: Focus on genomic DNA (chromosomal organization, comparative genomics, DNA replication, DNA repair, mobile DNA, mitochondrial DNA, chloroplast DNA).
Expression & Function: Focus on functional RNAs (microRNAs, tRNAs, rRNAs, mRNA splicing, alternative polyadenylation).
Regulation: Focus on processes that mediate gene read-out (epigenetics, chromatin, histone code, transcription, translation, protein degradation).
Cell Signaling: Focus on mechanisms that control information flow into the nucleus to control gene expression (kinase and phosphatase pathways controlled by extracellular ligands, Wnt, Notch, TGFbeta/BMPs, FGFs, IGFs, etc.).
Profiling of gene expression and genetic variation: Focus on high-throughput approaches (e.g., DeepSeq, ChIP-Seq, Affymetrix microarrays, proteomics) that define gene regulatory circuitry, molecular pathways and protein/protein networks.
Genetics: Focus on development in model organisms (e.g., mouse, frog, fruit fly, worm), human genetic variation, population genetics, as well as agricultural and veterinary genetics.
Molecular Pathology & Regenerative Medicine: Focus on the deregulation of molecular processes in human diseases and mechanisms supporting regeneration of tissues through pluripotent or multipotent stem cells.