RNA-ligand interaction scoring via data perturbation and augmentation modeling
Hongli Ma, Letian Gao, Yunfan Jin, Jianwei Ma, Yilan Bai, Xiaofan Liu, Pengfei Bao, Ke Liu, Zhenjiang Zech Xu, Zhi John Lu
Nature Computational Science. Published 2025-06-24. DOI: https://doi.org/10.1038/s43588-025-00820-x
Abstract
Despite recent advances in RNA-targeting drug discovery, the development of data-driven deep learning models remains challenging owing to limited validated RNA-small molecule interaction data and scarce known RNA structures. In this context, we introduce RNAsmol, a sequence-based deep learning framework that incorporates data perturbation with augmentation, graph-based molecular feature representation and attention-based feature fusion modules to predict RNA-small molecule interactions. RNAsmol employs perturbation strategies to balance the bias between the true negative and unknown interaction space, thereby elucidating the intrinsic binding patterns between RNA and small molecules. The resulting model demonstrates accurate predictions of the binding between RNA and small molecules, outperforming other methods in ten-fold cross-validation, unseen evaluation and decoy evaluation. Moreover, we use case studies to visualize molecular binding profiles and the distribution of learned weights, providing interpretable insights into RNAsmol's predictions. In particular, without requiring structural input, RNAsmol can generate reliable predictions and be adapted to various drug design scenarios.
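The abstract does not spell out how the perturbation is carried out, so the following is only a minimal sketch of one common way to build presumed negatives when validated non-binders are scarce: re-pair each RNA with ligands it is not known to bind. The function name `sample_perturbed_negatives`, its parameters and the toy sequences and SMILES strings are all hypothetical and are not taken from RNAsmol.

```python
import random


def sample_perturbed_negatives(positives, n_per_positive=1, seed=0):
    """Pair each RNA with ligands it is not known to bind.

    positives: list of (rna_sequence, ligand_smiles) known binders.
    Returns (rna, ligand) pairs that are absent from `positives`.
    These are *presumed* negatives: unlabeled pairs, not validated
    non-binders, which is the bias the abstract refers to.
    """
    rng = random.Random(seed)
    known = set(positives)
    ligands = sorted({lig for _, lig in positives})
    negatives = []
    seen = set()  # avoid emitting the same perturbed pair twice
    for rna, _ in positives:
        candidates = [
            lig for lig in ligands
            if (rna, lig) not in known and (rna, lig) not in seen
        ]
        rng.shuffle(candidates)
        for lig in candidates[:n_per_positive]:
            seen.add((rna, lig))
            negatives.append((rna, lig))
    return negatives


# Toy data: hypothetical sequences and SMILES, for illustration only.
positives = [
    ("GGCACUGA", "CCO"),
    ("AUGGCUAA", "c1ccccc1"),
    ("CCGUAGGC", "CC(=O)O"),
]
negatives = sample_perturbed_negatives(positives, n_per_positive=1)
```

Varying `n_per_positive` changes the ratio of presumed negatives to positives, which is one simple lever for studying how sensitive a model is to the assumed negative space.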