PreAIS: Prediction of A-to-I editing sites based on DNN-CNN deep learning models
Xingze Fang, Yun Zuo, Youxu Tan, Xiangrong Liu, Xiangxiang Zeng, Zhaohong Deng
Computational Biology and Chemistry, volume 119, Article 108612. Published 2025-07-28. DOI: 10.1016/j.compbiolchem.2025.108612
Abstract
Adenosine-to-inosine (A-to-I) RNA editing plays a crucial role in biological processes and disease, so identifying A-to-I editing sites is important for research and drug development. Accurate identification remains challenging, however, because current models are complex, have low accuracy, and generalize poorly. To overcome these limitations, a deep learning model called PreAIS is proposed for identifying A-to-I editing sites. PreAIS first extracts features with a k-mer algorithm and then builds model1 on a DNN-CNN architecture. The resulting model1 was trained and evaluated using 10-fold cross-validation. Compared with state-of-the-art models, PreAIS (model1) improved Accuracy (ACC), Specificity (Sp), and Sensitivity (Sn) on Dataset 1 by 3.01%, 0.67%, and 5.04%, respectively. On a human A-to-I RNA editing site dataset validated by Sanger sequencing, PreAIS (model1) identified 55 of 58 sites (94.8% accuracy), outperforming other classifiers. To further assess the model's generalization capability, Bi-profile Bayes features were extracted from Dataset 2, and model2 was constructed by adjusting only the input dimensions while keeping all other parameters unchanged. On the independent test set, the model again outperformed the current best predictive models, showing that its performance holds up on a different dataset. CAM was also employed to interpret the predictions of PreAIS. The PreAIS model and the related dataset constructed in this study are available on GitHub: https://github.com/xzfang00/PreAIS.
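The feature-extraction and evaluation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation (that lives in the GitHub repository above): the function names, the `ACGU` alphabet, the default value of `k`, the frequency normalization, and the round-robin fold assignment are all assumptions made for illustration. It shows one common way to turn a sequence into k-mer frequency features, how to split samples into 10 folds, and how ACC, Sp, and Sn follow from a confusion matrix.

```python
from itertools import product

ALPHABET = "ACGU"  # assumed RNA alphabet; the paper's encoding may differ


def kmer_frequencies(seq, k=3):
    """Return the normalized k-mer frequency vector of seq (length 4**k)."""
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = [0] * len(kmers)
    total = 0
    for start in range(len(seq) - k + 1):
        pos = index.get(seq[start:start + k])
        if pos is not None:  # skip windows containing other symbols, e.g. N
            counts[pos] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [c / total for c in counts]


def kfold_indices(n_samples, n_folds=10):
    """Yield (train_idx, test_idx); sample i is tested in fold i % n_folds.

    Hypothetical helper: a real pipeline would usually shuffle and
    stratify by class before splitting.
    """
    for fold in range(n_folds):
        test_idx = [i for i in range(n_samples) if i % n_folds == fold]
        train_idx = [i for i in range(n_samples) if i % n_folds != fold]
        yield train_idx, test_idx


def confusion_metrics(y_true, y_pred):
    """Compute Sn, Sp, and ACC for binary labels (1 = editing site)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "Sn": tp / (tp + fn) if tp + fn else 0.0,   # sensitivity
        "Sp": tn / (tn + fp) if tn + fp else 0.0,   # specificity
        "ACC": (tp + tn) / len(pairs) if pairs else 0.0,
    }


# Example: a 2-mer feature vector has 4**2 = 16 entries.
features = kmer_frequencies("AUGGCUA", k=2)
print(len(features))

# The Sanger-validated set contains only true editing sites, so
# recovering 55 of 58 gives the same value for Sn and ACC.
sanger = confusion_metrics([1] * 58, [1] * 55 + [0] * 3)
print(round(sanger["ACC"] * 100, 1))
```

Because every site in the Sanger-validated set is a positive example, the reported 94.8% is a recall on that set; specificity cannot be estimated from it.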
About the journal:
Computational Biology and Chemistry publishes original research papers and review articles in all areas of computational life sciences. High-quality research contributions with a major computational component in the areas of nucleic acid and protein sequence research, molecular evolution, molecular genetics (functional genomics and proteomics), theory and practice of either biology-specific or chemical-biology-specific modeling, and structural biology of nucleic acids and proteins are particularly welcome. Exceptionally high-quality research work in bioinformatics, systems biology, ecology, computational pharmacology, metabolism, biomedical engineering, epidemiology, and statistical genetics will also be considered.
Given their inherent uncertainty, protein modeling and molecular docking studies should be thoroughly validated. In the absence of experimental results for validation, complementary techniques such as molecular dynamics simulations with detailed free energy calculations should be used to support the major conclusions. Submissions of premature modeling exercises without additional biological insights will not be considered.
Review articles will generally be commissioned by the editors and should not be submitted to the journal without explicit invitation. However, prospective authors are welcome to send a brief synopsis (one to three pages), which will be evaluated by the editors.