Machine learning-driven discovery of multicomponent pharmaceutical solid forms via DualNet: confidence-aware prediction and ranking of salts and cocrystals

Mohammad Amin Ghanavati, Bahareh Khalili, Dino Alberico, Sohrab Rohani
International Journal of Pharmaceutics, vol. 684, Article 126117. Published 2025-09-02.
DOI: 10.1016/j.ijpharm.2025.126117
https://www.sciencedirect.com/science/article/pii/S0378517325009548
IF 5.2 · Zone 2 (Medicine) · JCR Q1 (Pharmacology & Pharmacy)
Citations: 0

Abstract

Salts and cocrystals are vital multicomponent entities for tuning the solid-state properties of pharmaceuticals, yet their experimental screening is labor-intensive and often inefficient. We introduce a DualNet Ensemble algorithm, a multi-class classification model that integrates molecular graph embeddings with curated physicochemical descriptors to predict the formation of salts, cocrystals, or physical mixtures while estimating predictive uncertainty. The proposed DualNet was trained on 70% of a curated dataset containing 22,298 experimentally validated entries. Evaluated on a held-out 15% test set, it achieved a macro-averaged recall of 0.952 and a macro-averaged F1-score of 0.940, demonstrating strong generalization. It was also well calibrated (expected calibration error, ECE = 0.0161) and significantly outperformed the traditional empirical ΔpKa rule in distinguishing salts from cocrystals. The model generalized well across six high-frequency compounds (salicylic acid, nicotinamide, succinic acid, benzoic acid, 2-butenoic acid, and oxalic acid), achieving a mean F1-score of 0.96. In a prospective ciprofloxacin case study, it ranked the three confirmed salts as the top three candidates and the only cocrystal that formed as the fifth. The results demonstrate the utility of the proposed methodology as a robust, interpretable, and experimentally reliable tool for accelerating multicomponent solid-form screening.
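The abstract reports macro-averaged recall, macro-averaged F1, and ECE without defining them. The following is a minimal sketch of how such metrics are commonly computed, assuming the standard definitions: macro averages are unweighted means of per-class scores, and ECE uses equal-width confidence bins. The function names, class labels, bin count of 10, and toy data are illustrative assumptions; the source does not state the authors' binning scheme or implementation.

```python
from collections import defaultdict

# Class labels are assumed from the three outcomes named in the abstract.
CLASSES = ("salt", "cocrystal", "physical_mixture")


def macro_recall_f1(y_true, y_pred, classes=CLASSES):
    """Return (macro recall, macro F1): unweighted means of per-class scores."""
    recalls, f1s = [], []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        denom = precision + recall
        f1s.append(2 * precision * recall / denom if denom else 0.0)
        recalls.append(recall)
    return sum(recalls) / len(classes), sum(f1s) / len(classes)


def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE = sum over bins of (bin size / N) * |bin accuracy - bin mean confidence|.

    confidences: top-class predicted probability for each sample, in [0, 1].
    correct: whether the top-class prediction matched the true label.
    n_bins: number of equal-width bins (an assumed value, not from the source).
    """
    bins = defaultdict(list)
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 falls in the last bin
        bins[idx].append((conf, ok))
    n = len(confidences)
    ece = 0.0
    for members in bins.values():
        acc = sum(ok for _, ok in members) / len(members)
        avg_conf = sum(conf for conf, _ in members) / len(members)
        ece += len(members) / n * abs(acc - avg_conf)
    return ece


# Toy data, invented purely for illustration.
y_true = ["salt", "salt", "cocrystal", "cocrystal", "physical_mixture", "physical_mixture"]
y_pred = ["salt", "cocrystal", "cocrystal", "cocrystal", "physical_mixture", "physical_mixture"]
top_conf = [0.92, 0.63, 0.81, 0.74, 0.96, 0.87]

recall, f1 = macro_recall_f1(y_true, y_pred)
ece = expected_calibration_error(top_conf, [t == p for t, p in zip(y_true, y_pred)])
print(f"macro recall={recall:.3f}  macro F1={f1:.3f}  ECE={ece:.3f}")
```

Macro averaging gives each class equal weight regardless of how many entries it has, which matters when salts, cocrystals, and physical mixtures are unevenly represented. A low ECE means the predicted confidence closely tracks the observed accuracy, which is what makes a confidence-based ranking of candidates meaningful.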
Source journal: International Journal of Pharmaceutics
CiteScore: 10.70
Self-citation rate: 8.60%
Articles published: 951
Review time: 72 days
About the journal: The International Journal of Pharmaceutics is the third most cited journal in the "Pharmacy & Pharmacology" category out of 366 journals. It serves pharmaceutical scientists concerned with the physical, chemical and biological properties of devices and delivery systems for drugs, vaccines and biologicals, including their design, manufacture and evaluation. This includes evaluation of the properties of drugs, of excipients such as surfactants and polymers, and of novel materials. The journal has special sections on pharmaceutical nanotechnology and personalized medicines, and publishes research papers, reviews, commentaries and letters to the editor as well as special issues.