Machine learning-driven discovery of multicomponent pharmaceutical solid forms via DualNet: confidence-aware prediction and ranking of salts and cocrystals
Mohammad Amin Ghanavati, Bahareh Khalili, Dino Alberico, Sohrab Rohani
International Journal of Pharmaceutics, vol. 684, Article 126117 (published 2025-09-02). DOI: 10.1016/j.ijpharm.2025.126117
https://www.sciencedirect.com/science/article/pii/S0378517325009548
Salts and cocrystals are vital multicomponent entities for tuning the solid-state properties of pharmaceuticals, yet their experimental screening is labor-intensive and often inefficient. We introduce a DualNet Ensemble algorithm, a multi-class classification model that integrates molecular graph embeddings with curated physicochemical descriptors to predict the formation of salts, cocrystals, or physical mixtures while estimating predictive uncertainty. The proposed DualNet was trained on 70% of a curated dataset containing 22,298 experimentally validated entries. Evaluated on a held-out 15% test set, it achieved a macro-averaged recall of 0.952 and F1-score of 0.940, demonstrating strong generalization. It was also well calibrated (expected calibration error, ECE = 0.0161) and significantly outperformed the traditional empirical ΔpKa rule in distinguishing salts from cocrystals. The model generalized well across six high-frequency compounds (salicylic acid, nicotinamide, succinic acid, benzoic acid, 2-butenoic acid, and oxalic acid), achieving a mean F1 score of 0.96. In a prospective ciprofloxacin case study, it ranked three confirmed salts as the top three candidates and the only cocrystal that formed as the fifth. These results demonstrate the utility of the proposed methodology as a robust, interpretable, and experimentally reliable tool for accelerating multicomponent solid-form screening.
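The two kinds of figures quoted in the abstract, macro-averaged recall/F1 and ECE, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the class labels, the toy data, and the choice of 10 equal-width confidence bins are all assumptions made for illustration, since the abstract does not specify how the metrics were computed.

```python
# Hypothetical class labels for the three outcomes named in the abstract.
CLASSES = ("salt", "cocrystal", "physical_mixture")


def macro_recall_f1(y_true, y_pred, classes=CLASSES):
    """Return (macro recall, macro F1): unweighted means of per-class scores."""
    recalls, f1s = [], []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        denom = precision + recall
        f1s.append(2 * precision * recall / denom if denom else 0.0)
        recalls.append(recall)
    return sum(recalls) / len(classes), sum(f1s) / len(classes)


def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: sample-weighted mean of |accuracy - mean confidence| per bin.

    Assumes equal-width bins over [0, 1]; other binning schemes exist.
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 joins last bin
        bins[idx].append((conf, ok))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += len(members) / n * abs(accuracy - mean_conf)
    return ece


# Toy data (invented, not from the paper): six predictions with confidences.
y_true = ["salt", "salt", "cocrystal", "cocrystal",
          "physical_mixture", "physical_mixture"]
y_pred = ["salt", "salt", "cocrystal", "salt",
          "physical_mixture", "physical_mixture"]
confidences = [0.95, 0.95, 0.75, 0.75, 0.95, 0.95]
correct = [t == p for t, p in zip(y_true, y_pred)]

recall, f1 = macro_recall_f1(y_true, y_pred)
ece = expected_calibration_error(confidences, correct)
print(f"macro recall={recall:.3f}  macro F1={f1:.3f}  ECE={ece:.3f}")
```

Macro averaging gives each class equal weight regardless of how many entries it has, which matters when salts, cocrystals, and physical mixtures are unevenly represented. A low ECE means the model's stated confidence closely tracks how often it is actually right, which is what makes confidence-based ranking of candidates meaningful.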
About the journal:
The International Journal of Pharmaceutics is the third most cited journal in the "Pharmacy & Pharmacology" category out of 366 journals. It serves pharmaceutical scientists concerned with the physical, chemical and biological properties of devices and delivery systems for drugs, vaccines and biologicals, including their design, manufacture and evaluation. Its scope also covers evaluation of the properties of drugs, of excipients such as surfactants and polymers, and of novel materials. The journal has special sections on pharmaceutical nanotechnology and personalized medicines, and publishes research papers, reviews, commentaries and letters to the editor as well as special issues.