DHUpredET: A comparative computational approach for identification of dihydrouridine modification sites in RNA sequence
Md Fahim Sultan, Tasmin Karim, Md Shazzad Hossain Shaon, Sayed Mehedi Azim, Iman Dehzangi, Mst Shapna Akter, Sobhy M. Ibrahim, Md Mamun Ali, Kawsar Ahmed, Francis M. Bui
Analytical Biochemistry, volume 702, Article 115828. Published 2025-03-06. DOI: 10.1016/j.ab.2025.115828
https://www.sciencedirect.com/science/article/pii/S0003269725000661
Impact factor: 2.6. JCR: Q2 (Biochemical Research Methods).
Citations: 0
Abstract
Laboratory-based detection of dihydrouridine (D) sites is laborious and expensive. In this study, we developed effective machine learning models employing efficient feature encoding methods to identify D sites. Initially, we explored various state-of-the-art feature encoding approaches, applied 30 machine learning techniques to each, and selected the top eight models based on their independent testing and cross-validation outcomes. Finally, we developed DHUpredET using the extra trees classifier to predict D sites. The DHUpredET model demonstrated balanced performance across all evaluation criteria, outperforming state-of-the-art models by 8% in accuracy and 14% in sensitivity on an independent test set. Further analysis revealed that the model achieved higher accuracy with position-specific two nucleotide (PS2) features, leading us to conclude that PS2 features are the best suited for the DHUpredET model. Therefore, our proposed model emerges as the preferred choice for predicting D sites. In addition, we conducted an in-depth analysis of local features and identified a particularly significant attribute, PS2_299, with a feature score of 0.035. This tool holds promise for accelerating the discovery of D modification sites, which contributes to therapeutic targeting and to understanding RNA structure.
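To make the PS2 encoding and the two reported metrics concrete, here is a minimal sketch. It assumes that PS2 means a one-hot encoding of each adjacent nucleotide pair at each position, which is the common definition; the abstract does not spell out the scheme, the sequence window length, or the feature ordering, so the names `ps2_encode` and `accuracy_and_sensitivity` and the dinucleotide order used here are illustrative assumptions rather than the authors' implementation.

```python
from itertools import product

NUCLEOTIDES = "ACGU"
# All 16 dinucleotides in a fixed (assumed) order: AA, AC, AG, AU, CA, ...
DINUCLEOTIDES = ["".join(p) for p in product(NUCLEOTIDES, repeat=2)]
DINUC_INDEX = {d: i for i, d in enumerate(DINUCLEOTIDES)}


def ps2_encode(seq):
    """One-hot encode every adjacent nucleotide pair by position.

    A sequence of length L yields 16 * (L - 1) binary features.
    Pairs containing a symbol outside ACGU stay all zeros.
    """
    seq = seq.upper().replace("T", "U")  # accept DNA-style input
    features = []
    for i in range(len(seq) - 1):
        block = [0] * len(DINUCLEOTIDES)
        idx = DINUC_INDEX.get(seq[i:i + 2])
        if idx is not None:
            block[idx] = 1
        features.extend(block)
    return features


def accuracy_and_sensitivity(y_true, y_pred):
    """Return (accuracy, sensitivity) for binary labels, with 1 = D site."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, sensitivity
```

Vectors produced this way could then be passed to a tree ensemble such as scikit-learn's `ExtraTreesClassifier`; the abstract does not say which implementation or hyperparameters were used.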
About the journal:
The journal's title Analytical Biochemistry: Methods in the Biological Sciences declares its broad scope: methods for the basic biological sciences that include biochemistry, molecular genetics, cell biology, proteomics, immunology, bioinformatics and wherever the frontiers of research take the field.
The emphasis is on methods from the strictly analytical to the more preparative that would include novel approaches to protein purification as well as improvements in cell and organ culture. The actual techniques are equally inclusive ranging from aptamers to zymology.
The journal has been particularly active in:
- Analytical techniques for biological molecules
- Aptamer selection and utilization
- Biosensors
- Chromatography
- Cloning, sequencing and mutagenesis
- Electrochemical methods
- Electrophoresis
- Enzyme characterization methods
- Immunological approaches
- Mass spectrometry of proteins and nucleic acids
- Metabolomics
- Nano level techniques
- Optical spectroscopy in all its forms
The journal is reluctant to include most drug and strictly clinical studies as there are more suitable publication platforms for these types of papers.