Luigi Da Vià, Matthias Depoortere, Robert D. Willacy, Alastair J. Roberts, Pandian Sokkar, Mathieu Fossépré, Andrew Ruba, Magdalena A. Zwierzyna
Title: Enabling Data-Driven Solubility Modeling at GSK: Enhancing Purge Predictions for Mutagenic Impurities
Journal: Organic Process Research & Development (impact factor 3.1; JCR Q2, Chemistry, Applied)
Published: 2024-11-11
DOI: https://doi.org/10.1021/acs.oprd.4c00384
Citations: 0
Abstract
In the pharmaceutical industry, solubility is a critical parameter influencing various stages of drug development, from early discovery to commercial manufacturing. This work showcases a high-throughput solubility screening workflow and describes the steps required to standardize and curate the resulting data for automated data flow. Using this high-quality data, we developed a quantitative structure–property relationship model based on gradient boosting and molecular descriptors, which requires only a 2D molecular structure to generate predictions. The accuracy of the model is competitive with alternative approaches that do not require additional physical data. A key use case for these solubility predictions is the development of control strategies for mutagenic impurities, where they provide a data-driven and consistent method for calculating the solubility contribution to purge calculations. Further perspective is given on the future application of the model as a solubility prediction algorithm and on data-driven methodologies supporting drug development in general, highlighting the potential of federated learning approaches, which use technology to overcome the barrier to cross-industry data sharing.
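As an illustration of how a predicted solubility could feed a purge calculation, the following is a minimal sketch rather than the authors' implementation. It assumes a discrete scoring scheme in which each process stage receives a solubility score and the scores are multiplied together. The cut-off values, the score values, and the function names `solubility_score` and `solubility_purge` are all hypothetical and are not taken from the paper.

```python
from math import prod
from typing import Iterable

# Hypothetical cut-offs (mg/mL) and scores, for illustration only.
# They are NOT taken from the paper; substitute the thresholds your
# own control strategy defines.
SOLUBILITY_SCORES = [
    (100.0, 10),  # at or above 100 mg/mL: treated as freely soluble
    (10.0, 3),    # at or above 10 mg/mL: treated as moderately soluble
]
DEFAULT_SCORE = 1  # below every cut-off: treated as sparingly soluble


def solubility_score(predicted_mg_per_ml: float) -> int:
    """Map one predicted solubility value to a discrete purge score."""
    if predicted_mg_per_ml < 0:
        raise ValueError("solubility must be non-negative")
    # Cut-offs are listed from highest to lowest, so the first match wins.
    for cutoff, score in SOLUBILITY_SCORES:
        if predicted_mg_per_ml >= cutoff:
            return score
    return DEFAULT_SCORE


def solubility_purge(predictions_mg_per_ml: Iterable[float]) -> int:
    """Multiply the per-stage solubility scores into one contribution."""
    return prod(solubility_score(s) for s in predictions_mg_per_ml)
```

Under these assumed thresholds, a three-stage process with predicted impurity solubilities of 250, 40, and 2 mg/mL would give `solubility_purge([250.0, 40.0, 2.0])` equal to 10 × 3 × 1 = 30. Keeping the thresholds in a single table makes the scoring consistent across projects, which is the property the abstract emphasizes.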
About the journal:
The journal Organic Process Research & Development serves as a communication tool between industrial chemists and chemists working in universities and research institutes. As such, it reports original work from the broad field of industrial process chemistry but also presents academic results that are relevant, or potentially relevant, to industrial applications. Process chemistry is the science that enables the safe, environmentally benign, and ultimately economical manufacturing of organic compounds that are required in larger amounts to help address the needs of society. Consequently, the Journal encompasses every aspect of organic chemistry, including all aspects of catalysis, synthetic methodology development, and synthetic strategy exploration, but also includes aspects of analytical and solid-state chemistry and chemical engineering, such as work-up tools, process safety, or flow chemistry. The goal of developing and optimizing chemical reactions and processes is their transfer to a larger scale; original work describing such studies and their actual implementation on scale is highly relevant to the journal. However, studies on new developments from industry, research institutes, or academia that have not yet been demonstrated on scale, but where industrial utility can be expected and where the study has addressed important prerequisites for scale-up and has given confidence in the reliability and practicality of the chemistry, also serve the mission of OPR&D as a communication tool between the different contributors to the field.