Exon–intron boundary detection made easy by physicochemical properties of DNA

IF 3 · Zone 4 (Biology) · Q3 BIOCHEMISTRY & MOLECULAR BIOLOGY
Molecular Omics · Pub Date: 2025-03-11 · DOI: 10.1039/D4MO00241E
Dinesh Sharma, Danish Aslam, Kopal Sharma, Aditya Mittal and B. Jayaram
Molecular Omics, vol. 3, pp. 226-239 · Journal Article · Not open access
https://pubs.rsc.org/en/content/articlelanding/2025/mo/d4mo00241e
Citations: 0

Abstract

Exon–intron boundary detection made easy by physicochemical properties of DNA

Genome architecture in eukaryotes is highly complex. Among its many intricacies, the organization of genes as non-continuous stretches of exons and introns has drawn particular attention from researchers. Accurate identification of exon–intron (EI) boundaries is crucial for deciphering the molecular biology that governs gene expression and regulation. This includes understanding both normal and aberrant splicing, where aberrant splicing is the abnormal processing of pre-mRNA that leads to improper inclusion or exclusion of exons or introns. Such splicing events can produce dysfunctional or non-functional proteins, which are often associated with disease. Current frameworks for detecting genomic signals, which aim to identify exons and introns within a genomic segment, need revision, primarily because a robust consensus sequence is lacking and because training is limited to the available experimental datasets. To tackle these challenges, and building on the understanding that DNA exhibits function-dependent local physicochemical variations, we present ChemEXIN, a novel method for predicting EI boundaries. The method combines a deep-learning (DL) architecture with tri- and tetra-nucleotide-based structural and energy features. ChemEXIN outperforms existing methods in accuracy and precision: it achieves accuracies of 92.5% for humans, 79.9% for mice, and 92.0% for worms, with precision values of 92.0%, 79.6%, and 91.8% for the same organisms, respectively. These results represent a significant advance in EI boundary annotation, with potential implications for understanding gene expression, regulation, and cellular function.
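
As a rough illustration of what a tri-nucleotide-based encoding can look like, the Python sketch below slides a 3-base window along a sequence and looks up one value per window. The abstract does not describe ChemEXIN's encoding at this level of detail, so everything here is an assumption: the table `TOY_PARAM` holds invented placeholder numbers (not real structural or energy parameters), and the function names are hypothetical. The second function only restates the standard definitions of the two metrics quoted in the abstract.

```python
from itertools import product

BASES = "ACGT"

# Placeholder table (assumption): one made-up number per tri-nucleotide.
# A real method would use measured or computed physicochemical parameters.
TOY_PARAM = {"".join(k): float(i) for i, k in enumerate(product(BASES, repeat=3))}


def trinucleotide_profile(seq, table=TOY_PARAM):
    """Return one table value per overlapping 3-base window of seq."""
    seq = seq.upper()
    profile = []
    for i in range(len(seq) - 2):
        window = seq[i:i + 3]
        # Windows containing ambiguous bases (e.g. N) map to None
        # instead of a guessed value.
        profile.append(table.get(window))
    return profile


def accuracy_and_precision(tp, fp, tn, fn):
    """Standard accuracy and precision from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    return accuracy, precision
```

A profile of this kind could be fed to a classifier as a per-position feature vector; a tetra-nucleotide variant would use a 4-base window and a 256-entry table in the same way.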

Source journal
Molecular Omics
Subject area: Biochemistry, Genetics and Molecular Biology (Biochemistry)
CiteScore: 5.40
Self-citation rate: 3.40%
Articles published: 91
Journal description: Molecular Omics publishes high-quality research from across the -omics sciences. Topics include, but are not limited to:
-omics studies to gain mechanistic insight into biological processes, for example determining the mode of action of a drug or the basis of a particular phenotype, such as drought tolerance
-omics studies for clinical applications with validation, such as finding biomarkers for diagnostics or potential new drug targets
-omics studies looking at the sub-cellular make-up of cells, for example the subcellular localisation of certain proteins, post-translational modifications, or new imaging techniques
studies presenting new methods and tools to support omics studies, including new spectroscopic/chromatographic techniques, chip-based/array technologies, and new classification/data analysis techniques. New methods should be proven and demonstrate an advance in the field.
Molecular Omics only accepts articles of high importance and interest that provide significant new insight into important chemical or biological problems. This could be fundamental research that significantly increases understanding, or research that demonstrates clear functional benefits. Papers reporting new results that could be routinely predicted, that do not show a significant improvement over known research, or that are of interest only to the specialist in the area are not suitable for publication in Molecular Omics.