ToxBERT: an explainable AI framework for enhancing prediction of adverse drug reactions and structural insights
Yujie He, Xiang Lv, Wulin Long, Shengqiu Zhai, Menglong Li, Zhining Wen
Journal of Pharmaceutical Analysis, 15(8), 101387. Published 2025-08-01 (Epub 2025-07-03).
DOI: https://doi.org/10.1016/j.jpha.2025.101387
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12446765/pdf/
Abstract
Accurate prediction of adverse drug reactions (ADRs) is crucial for drug safety evaluation because it directly affects public health. Although various models have shown promising results in predicting ADRs, their accuracy still needs improvement. In addition, many existing models lack interpretability when linking molecular structures to specific ADRs and frequently rely on manually selected molecular fingerprints, which can introduce bias. To address these challenges, we propose ToxBERT, an efficient transformer encoder model that applies attention and masking mechanisms to simplified molecular input line entry system (SMILES) representations. ToxBERT achieved area under the receiver operating characteristic curve (AUROC) scores of 0.839, 0.759, and 0.664 for predicting drug-induced QT prolongation (DIQT), rhabdomyolysis, and liver injury, respectively, outperforming previous studies. Furthermore, ToxBERT can identify drug substructures that are closely associated with specific ADRs. These findings indicate that ToxBERT is a valuable tool not only for understanding the mechanisms underlying specific ADRs but also for mitigating potential ADRs in the drug discovery pipeline.
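To make the masking idea concrete, the following is a minimal sketch of how a SMILES string might be tokenized and partially masked before being fed to a transformer encoder. The abstract does not describe ToxBERT's tokenizer or masking rate, so the regular expression, the 15% masking probability, and the names `tokenize_smiles` and `mask_tokens` are all assumptions chosen for illustration, not the authors' implementation.

```python
import random
import re

# A widely used style of regex for splitting SMILES into atom, bond, and ring
# tokens. This is an assumption: the abstract does not specify the tokenizer.
SMILES_TOKEN_RE = re.compile(
    r"(\[[^\]]+\]|Br|Cl|[BCNOSPFI]|[bcnosp]|%\d{2}|\d|[=#\-\+\(\)/\\\.@:~\*\$])"
)

MASK = "[MASK]"  # hypothetical mask symbol, following BERT convention


def tokenize_smiles(smiles):
    """Split a SMILES string into tokens; raise if any character is skipped."""
    tokens = SMILES_TOKEN_RE.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"untokenizable characters in {smiles!r}")
    return tokens


def mask_tokens(tokens, mask_prob=0.15, seed=None):
    """Replace a random subset of tokens with the mask symbol.

    Returns (masked_tokens, labels). labels[i] holds the original token at
    masked positions and None elsewhere, which is the target a masked
    language model would be trained to recover. The 0.15 default is an
    assumed rate, not a value taken from the paper.
    """
    if not tokens:
        return [], []
    rng = random.Random(seed)
    # Always mask at least one token so short molecules still yield a target.
    n_mask = max(1, round(len(tokens) * mask_prob))
    positions = set(rng.sample(range(len(tokens)), n_mask))
    masked = [MASK if i in positions else t for i, t in enumerate(tokens)]
    labels = [t if i in positions else None for i, t in enumerate(tokens)]
    return masked, labels


if __name__ == "__main__":
    aspirin = "CC(=O)Oc1ccccc1C(=O)O"
    toks = tokenize_smiles(aspirin)
    masked, labels = mask_tokens(toks, seed=0)
    print(toks)
    print(masked)
```

In a setup like this, the encoder would be trained to predict the original token at each masked position. The attention weights learned over the unmasked tokens are what an interpretability analysis could then inspect when linking substructures to a given ADR.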