Zihang Wang, Aoyun Geng, Junlin Xu, Yajie Meng, Zilong Zhang, Leyi Wei, Quan Zou, Feifei Cui
A comprehensive review of computational methods for predicting DNA N4-methylcytosine sites
International Journal of Biological Macromolecules, article 148221. Published 2025-10-09. DOI: 10.1016/j.ijbiomac.2025.148221 (https://doi.org/10.1016/j.ijbiomac.2025.148221)
Citations: 0
Abstract
N4-methylcytosine (4mC) is a distinct form of DNA methylation that plays a critical role in various biological processes: it protects bacterial DNA from degradation and participates in the regulation of gene expression. Traditional experimental methods for detecting 4mC are often costly, slow, and labor-intensive, so computational approaches have increasingly replaced them. Over the past five years, a growing number of machine learning (ML) and deep learning (DL) models have been developed to predict 4mC sites. In this review, we provide a systematic overview of these computational methods, focusing on model architectures and comparing the strengths and limitations of ML- and DL-based approaches. To facilitate future tool development, we collect and organize commonly used databases and benchmark datasets relevant to 4mC prediction. We also compare several recently proposed methods to show their respective strengths and capabilities. Finally, we discuss the current challenges and opportunities in the field, with the aim of supporting the development of more accurate and robust predictive frameworks for 4mC methylation.
About the journal:
The International Journal of Biological Macromolecules is a well-established international journal dedicated to research on the chemical and biological aspects of natural macromolecules. Focusing on proteins, macromolecular carbohydrates, glycoproteins, proteoglycans, lignins, biological poly-acids, and nucleic acids, the journal presents the latest findings on molecular structure, properties, biological activities, interactions, modifications, and functional properties. Papers must offer novel insights and may cover related model systems, structural conformational studies, theoretical developments, and analytical techniques. Each paper is required to focus primarily on at least one named biological macromolecule, reflected in the title, abstract, and text.