AI In Action: Redefining Drug Discovery and Development

Impact factor: 3.1 · Zone 3 (Medicine) · JCR Q2 (Medicine, Research & Experimental)
Anshul Kanakia, Mark Sale, Liang Zhao, Zhu Zhou
Clinical and Translational Science, volume 18, issue 2. Published 2025-02-06. DOI: 10.1111/cts.70149. Full text: https://onlinelibrary.wiley.com/doi/10.1111/cts.70149

Abstract

AI has revolutionized the drug discovery space in recent years, with applications ranging from highly accurate protein structure prediction [1] to the design and optimization of both small and large molecules [2]. Several large foundation models have been developed to encode functional information about proteins in support of the drug development pipeline [3, 4]. Figure 1 highlights the areas in the pipeline where AI now plays a significant role and is poised to disrupt traditional experimental techniques. The culmination of AI-driven discovery is de novo design, where the entire preclinical pipeline can be performed in silico. This would yield billions of dollars in R&D cost savings, translating to cheaper medications and higher clinical success rates through the optimization of safer, more developable molecules that show strong efficacy against well-selected targets.

While de novo design is as yet unproven, the success rate of the 21 AI-developed drugs that had completed Phase I trials as of December 2023 is 80%–90%, significantly higher than the ~40% for traditional methods [5]. The number of AI-developed candidate drugs entering clinical stages also continues to increase, and the trend is growing at an exponential rate: from 3 in 2016 to 17 in 2020 and 67 in 2023 [5].
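
As a rough check on the "exponential" description, the compound annual growth rate implied by the counts quoted above can be computed directly. The sketch below uses only the three figures from the text (3, 17, and 67); the helper name `cagr` and the dictionary layout are illustrative choices, not something taken from the cited source.

```python
def cagr(start_count: float, end_count: float, years: int) -> float:
    """Compound annual growth rate between two counts `years` apart."""
    if start_count <= 0 or years <= 0:
        raise ValueError("start_count and years must be positive")
    return (end_count / start_count) ** (1.0 / years) - 1.0


# Counts of AI-developed candidates entering clinical stages, as quoted in the text.
counts = {2016: 3, 2020: 17, 2023: 67}

years = sorted(counts)
# Each consecutive interval, followed by the full 2016-2023 span.
intervals = list(zip(years, years[1:])) + [(years[0], years[-1])]
rates = {(a, b): cagr(counts[a], counts[b], b - a) for a, b in intervals}

for (a, b), rate in rates.items():
    print(f"{a}-{b}: {rate:.0%} per year")
```

With these figures the implied rate is roughly 54% to 58% per year in each interval, which is consistent with approximately steady exponential growth. Three data points are too few to fit a growth model with any confidence, so this is a sanity check rather than a forecast.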

The combination of high-quality data access across life science modalities (imaging, multi-omics, DMRs, and very large protein repertoires) with recent advances in the scaling and architecture of large deep learning models has led to an explosion in AI applications for healthcare. While some of this data is publicly available, much of it is proprietary and under the control of large pharmaceutical companies, partly due to regulatory and privacy concerns. Innovation in AI for drug discovery, by contrast, is being led by academic and industry research laboratories, often resulting in highly funded spin-off ventures like Genentech, Recursion, Absci, and, more recently, EvolutionaryScale. Such AI-first life sciences companies have found success in synergistic partnerships with large pharmaceutical companies, thereby gaining access to large proprietary datasets to which they can apply their AI expertise. Some of these partnerships have led to acquisitions, such as the 2009 purchase of Genentech by Roche for approximately $46.8 billion, highlighting the value that AI internalization brings to large pharmaceutical companies.

The use of AI is poised to cover the full life cycle of a drug product, including drug discovery, drug development, and application assessment in a regulatory setting. Recent research from the Food and Drug Administration (FDA) included two distinct case studies. The first exemplifies conventional machine learning (ML) approaches through a project aimed at decoding kinase–adverse event associations for small molecule kinase inhibitors (SMKIs). A multi-domain dataset was constructed from 4638 patients in registrational trials of 16 FDA-approved SMKIs, and ML models such as Random Survival Forests (RSF), Artificial Neural Networks (ANNs), and DeepHit were used to find potential associations between 442 kinases and 2145 adverse events. This information was made publicly accessible via an interactive web application, “Identification of Kinase-Specific Signal” (https://gongj.shinyapps.io/ml4ki). The platform helps experimentalists identify and verify kinase-inhibitor adverse event pairs and serves as a precision-medicine tool to mitigate individual patient safety risks by forecasting clinical safety signals [6]. In general, the credibility of AI models in extrapolation and generalization depends heavily on the diversity and comprehensiveness of the training data. Future studies integrating richer datasets with detailed genomic, phenotypic, and demographic information could further improve the precision of such associations and help refine the applicability of these models to specific patient subgroups. Although Multi-Input Neural Networks were not employed in this study, they represent a promising architecture for integrating heterogeneous datasets, such as kinase activity, demographic data, and clinical outcomes, into a unified predictive framework. Additionally, hybrid approaches combining neural networks with Markov Chains could be explored to capture sequential dependencies in disease progression and improve the robustness of predictions across diverse patient cohorts.
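
The multi-input idea mentioned above can be illustrated with a minimal sketch: each data modality gets its own small encoder branch, the branch outputs are concatenated, and a shared head maps the fused vector to a single score. Everything in this example (the class name `MultiInputNet`, the feature names and sizes, the random untrained weights) is an assumption made for illustration. It is not the method of the cited FDA study, which did not use this architecture.

```python
import math
import random


def dense_relu(inputs, weights, biases):
    """Fully connected layer followed by ReLU: one output per weight row."""
    return [
        max(0.0, sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def init_layer(rng, n_in, n_out):
    """Small random weights and zero biases (untrained, illustration only)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class MultiInputNet:
    """One encoder branch per modality; branch outputs are concatenated
    and passed to a shared head that emits one score in (0, 1)."""

    def __init__(self, input_sizes, branch_width=4, seed=0):
        rng = random.Random(seed)
        self.branches = {
            name: init_layer(rng, size, branch_width)
            for name, size in input_sizes.items()
        }
        fused_width = branch_width * len(input_sizes)
        self.head_weights, self.head_biases = init_layer(rng, fused_width, 1)

    def forward(self, inputs):
        fused = []
        for name, (weights, biases) in self.branches.items():
            features = inputs[name]
            if len(features) != len(weights[0]):
                raise ValueError(f"wrong feature count for {name!r}")
            # Concatenate this branch's encoding onto the fused vector.
            fused.extend(dense_relu(features, weights, biases))
        # Linear head followed by a sigmoid.
        logit = sum(w * x for w, x in zip(self.head_weights[0], fused))
        logit += self.head_biases[0]
        return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical feature sizes and values; none of these come from the cited study.
net = MultiInputNet({"kinase_activity": 5, "demographics": 3, "clinical": 2})
score = net.forward({
    "kinase_activity": [0.9, 0.1, 0.0, 0.4, 0.7],
    "demographics": [0.6, 1.0, 0.0],
    "clinical": [0.2, 0.8],
})
print(f"untrained score: {score:.3f}")
```

Keeping a separate branch per modality lets each input type have its own dimensionality and preprocessing before fusion. A real model would be trained on data and would typically be built with a deep learning framework rather than plain Python lists.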

The second case study showcases the application of generative AI methods through the development of PharmBERT, a domain-specific large language model (LLM) for drug labels [7]. Leveraging the foundational BERT architecture, PharmBERT was pre-trained on textual data extracted from 138,924 raw drug labels sourced from DailyMed. This pre-training on domain-specific text significantly improved the model's performance in extracting pharmacokinetic information from drug labeling. PharmBERT demonstrated superior performance in tasks such as adverse drug reaction (ADR) detection and ADME (absorption, distribution, metabolism, and excretion) classification, surpassing other models like ClinicalBERT and BioBERT. This advancement underscores the potential of LLMs to enhance the efficiency of text-related regulatory work and improve the extraction of critical information from complex drug labels.

Together, these case studies illustrate the transformative impact of AI on drug development and regulatory science. Traditional AI methods provide robust frameworks for specific, structured data analyses, while generative AI methods offer expansive capabilities for handling unstructured data and developing generalized intelligence. Both approaches are crucial for advancing personalized medicine and optimizing drug development processes.

Figure 2 summarizes the results from two surveys conducted during the “When AI Meets Drug Development” session at the 2024 American Society for Clinical Pharmacology and Therapeutics Annual Meeting. The first question assessed views on AI's potential to significantly change drug R&D. Notably, 80% of participants recognized AI's significant impact, while 12% were unconvinced. No participants were unaware of AI's application in drug R&D, suggesting a high level of awareness within the clinical pharmacology community. A small minority (6%) were uncertain about AI's current capabilities, and 2% selected an unspecified option. Regarding AI's impact over the next 5–10 years, 45% favored its application in molecule design and optimization, followed by clinical trials and development (28%), target discovery and validation (20%), and preclinical testing and screening (7%). The results highlight the current familiarity, usage, and perceptions of AI among the clinical pharmacology community, indicating strong interest in and optimism about AI's role in the future of drug development.

Looking ahead, the integration of AI in drug R&D is poised to accelerate, driven by advancements from leading tech companies. NVIDIA's powerful GPUs and AI frameworks are enabling faster and more efficient generative drug discovery processes. Google Health is leveraging its expertise in data analytics and ML to enhance predictive modeling and patient data analysis. Apple Health is contributing through its health data ecosystem, facilitating personalized medicine and real-time health monitoring. OpenAI's cutting-edge language models are revolutionizing the way researchers generate hypotheses and analyze scientific literature. These innovations collectively promise to streamline the drug development pipeline, reduce costs, and improve clinical outcomes, heralding a new era of precision medicine.

As global investment in AI for drug discovery accelerates, so does the expectation of improved outcomes for drug programs. As of 2024, no medication on the market has been developed using an AI-first pipeline. Future drivers for AI, particularly in healthcare, need to show disruption to existing business processes and tangible financial gains. This could happen via the launch of the first AI-developed medication or through AI-based clinical pipeline improvements that significantly reduce the lead time from first-patient-in to regulatory approval.

M.S. is an employee of Certara. A.K. is an employee of AstraZeneca. All other authors declared no competing interests for this work.

Source journal: Clinical and Translational Science (Medicine, Research & Experimental)