Leqi Lin, Xingyu Zhou, Kaiyuan Yang, Yang Liu, Xizhong Chen
DeepSeek-LLM with Adaptive RAG for Pharmaceutical Dissolution Prediction
Pharmaceutical Research, published 2025-10-08. DOI: 10.1007/s11095-025-03932-1 (https://doi.org/10.1007/s11095-025-03932-1)
Abstract
Purpose: This work aims to accelerate and enhance pharmaceutical drug dissolution prediction by integrating advanced Large Language Models (LLMs) and AI-diffusion models to reduce reliance on time-consuming, costly empirical experiments. The framework sets a foundation for broader adoption of generative AI in drug development.
Methods: This work introduces a DeepSeek-based LLM framework augmented by prompt engineering (zero-shot, few-shot, chain-of-thought) and adaptive weighted retrieval-augmented generation (RAG) to predict dissolution profiles systematically from basic physical properties. Moreover, a diffusion model synthesizes SEM-derived morphological parameters (e.g., particle size, surface area), circumventing error accumulation from multi-instrument characterization workflows. These parameters feed the RAG database, enabling LLM predictions grounded in structure-performance relationships rather than idealized assumptions.
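The retrieval and prompting step described in the Methods could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the retrieval weights are adapted, so the inverse-range weighting below is only a stand-in, and the `Record` fields, function names, and prompt wording are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # Hypothetical database entry; the field names are illustrative only.
    name: str
    particle_size_um: float
    surface_area_m2_g: float
    profile_note: str  # free-text summary of a measured dissolution profile


FEATURES = ("particle_size_um", "surface_area_m2_g")


def feature_weights(records):
    """Stand-in 'adaptive' rule (an assumption): weight each feature by the
    inverse of its range in the database so no single unit dominates."""
    weights = {}
    for feature in FEATURES:
        values = [getattr(r, feature) for r in records]
        spread = max(values) - min(values)
        weights[feature] = 1.0 / spread if spread > 0 else 0.0
    return weights


def retrieve(records, query, k=2):
    """Return the k records closest to the query under the weighted distance."""
    weights = feature_weights(records)

    def distance(record):
        return sum(
            weights[f] * abs(getattr(record, f) - query[f]) for f in FEATURES
        )

    return sorted(records, key=distance)[:k]


def build_prompt(examples, query):
    """Assemble a few-shot, chain-of-thought style prompt from retrieved records."""
    lines = [
        "You predict drug dissolution profiles from physical properties.",
        "Reason step by step before giving the profile.",
        "",
    ]
    for ex in examples:
        lines.append(
            f"Example: particle size {ex.particle_size_um} um, "
            f"surface area {ex.surface_area_m2_g} m2/g -> {ex.profile_note}"
        )
    lines.append("")
    lines.append(
        f"Query: particle size {query['particle_size_um']} um, "
        f"surface area {query['surface_area_m2_g']} m2/g"
    )
    return "\n".join(lines)
```

The resulting prompt string would then be sent to the LLM. That call is omitted here because the abstract does not describe the model interface.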
Results: Overall, the dissolution profile generated by the LLM with few-shot chain-of-thought prompting and RAG agrees best with the experimental results among the strategies compared. A sensitivity analysis quantifies the reliability and stability of the prompt content. Additionally, diffusion-generated structural data from SEM images are combined with the LLM's predictive capabilities to connect macro-scale physical properties with microstructural characteristics, achieving a close profile trend with acceptable RMSE and PCC.
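The two agreement metrics named in the Results can be computed as in the sketch below. It assumes RMSE is the root-mean-square error and PCC the Pearson correlation coefficient between the experimental and predicted dissolution values at matching time points. The function names are illustrative and not taken from the paper.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between two equally long sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def pcc(observed, predicted):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("sequences must have equal length of at least 2")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_o * var_p)
```

A low RMSE indicates that predicted values are close to the measurements in absolute terms, while a PCC near 1 indicates that the predicted curve follows the same trend, so the two are usually reported together.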
Conclusions: This study demonstrates the potential of the DeepSeek-based LLM framework to describe the dissolution of drug powders. Among the system prompt strategies tested, few-shot chain-of-thought with RAG yields the best dissolution profile predictions, although it may overcomplicate straightforward tasks in certain scenarios. The combination with diffusion models successfully bridges AI-driven insights (e.g., dissolution predictions) with physical and structural drug properties (e.g., particle geometry from SEM images).
About the journal:
Pharmaceutical Research, an official journal of the American Association of Pharmaceutical Scientists, is committed to publishing novel research that is mechanism-based, hypothesis-driven and addresses significant issues in drug discovery, development and regulation. Current areas of interest include, but are not limited to:
- (pre)formulation engineering and processing
- computational biopharmaceutics
- drug delivery and targeting
- molecular biopharmaceutics and drug disposition (including cellular and molecular pharmacology)
- pharmacokinetics, pharmacodynamics and pharmacogenetics.
Research may involve nonclinical and clinical studies, and utilize both in vitro and in vivo approaches. Studies on small drug molecules, pharmaceutical solid materials (including biomaterials, polymers and nanoparticles), biotechnology products (including genes, peptides, proteins and vaccines), and genetically engineered cells are welcome.