Carlos A Padilla, Luis M Díaz-Sánchez, Cristian Blanco-Tirado, Aldo F Combariza, Marianny Y Combariza
AI-Guided Design of MALDI Matrices: Exploring the Electron Transfer Chemical Space for Mass Spectrometric Analysis of Low-Molecular-Weight Compounds
Journal of the American Society for Mass Spectrometry, pp. 2836-2848. Published 2024-12-04 (Epub 2024-10-14). Impact factor 3.1; JCR Q2 (Biochemical Research Methods).
DOI: 10.1021/jasms.4c00186 (https://doi.org/10.1021/jasms.4c00186)
Citations: 0
Abstract
The development of matrices for Matrix-Assisted Laser Desorption/Ionization Mass Spectrometry (MALDI MS) has traditionally relied on experimental effort. Here, we propose a Goal-Directed artificial intelligence generative model, trained on data from computational chemistry calculations, to construct a chemical space optimized for Electron Transfer (ET) processes in MALDI analysis. We took a group of 30 reported ET matrices and subjected them to structural enumeration and molecular property prediction using semiempirical and ab initio calculations, establishing a database of diverse structural and property data. Subsequently, using a structural enumeration protocol with 68 canonical SMILES of Bemis-Murcko (BM) fragments, we expanded the structural complexity of the initial library. This process generated 82,753 compounds organized into 10 scaffold levels, with a p50 index of 50% from the Cyclic System Retrieval (CSR) curve of scaffolds. From the enumerated library, a diverse subset of structures was selected using the Jarvis-Patrick clustering method. These structures, along with their associated properties obtained from quantum mechanical calculations and experimental data, were used to train a Machine Learning (ML) model to predict ionization energy (Ei) values. A Scoring Neural Network (SNN) was then trained, coupled with our Goal-Directed generative model, which uses a Recurrent Neural Network (RNN) with Deep Learning (DL) architectures. The generative model was guided by a prior network within a Reinforcement/Transfer Learning environment. The final generative model learned that structures with high unsaturation, H/C ratios under 1, and molecular weights between 100 and 300 u are favorable for ET MALDI matrices, as are those with few aromatic rings and zero aliphatic rings. Other molecular features were also favored. The resulting AI-generated library exhibits Ei values over 8.0 eV, akin to those of reported "good" ET MALDI matrices, together with high synthetic accessibility scores, indicating successful design. In conclusion, our generative model provided valuable insights into the molecular features ideal for ET MALDI compounds while generating a wide range of structurally diverse molecules within a similar molecular property space. The next critical step is to synthesize a selection of the generated compounds for experimental validation and further characterization.
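As a rough illustration of the formula-level features named in the abstract (H/C ratio under 1, molecular weight between 100 and 300 u, high unsaturation), the following minimal Python sketch screens a molecular formula against those thresholds. It is not the authors' pipeline: the function names, the small atomic-mass table, and the example formulas are assumptions made for illustration. Ring counts and Ei values cannot be derived from a formula alone, so they are left out.

```python
import re

# Standard atomic masses (u) for a few common elements (illustrative subset).
ATOMIC_MASS = {
    "C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999,
    "S": 32.06, "F": 18.998, "Cl": 35.45, "Br": 79.904,
}


def parse_formula(formula):
    """Parse a flat formula such as 'C14H10' into {element: count}."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def descriptors(formula):
    """Return molecular weight, H/C ratio, and double-bond equivalents (DBE)."""
    counts = parse_formula(formula)
    c = counts.get("C", 0)
    h = counts.get("H", 0)
    n = counts.get("N", 0)
    halogens = sum(counts.get(el, 0) for el in ("F", "Cl", "Br"))
    mw = sum(ATOMIC_MASS[el] * k for el, k in counts.items())
    return {
        "mw": mw,
        "h_to_c": h / c if c else float("inf"),
        # Degree of unsaturation: (2C + 2 + N - H - X) / 2
        "dbe": (2 * c + 2 + n - h - halogens) / 2,
    }


def passes_screen(formula, mw_range=(100.0, 300.0), max_h_to_c=1.0):
    """Hypothetical screen using the thresholds quoted in the abstract."""
    d = descriptors(formula)
    return mw_range[0] <= d["mw"] <= mw_range[1] and d["h_to_c"] < max_h_to_c


if __name__ == "__main__":
    # Example formulas chosen for illustration; they are not taken from the paper.
    for name, formula in [("anthracene", "C14H10"), ("cyclohexane", "C6H12")]:
        print(name, descriptors(formula), passes_screen(formula))
```

Under these assumptions, anthracene (C14H10) passes the screen, while cyclohexane (C6H12) fails on both molecular weight and H/C ratio. A structure-aware version covering aromatic and aliphatic ring counts would need a cheminformatics toolkit operating on SMILES rather than on formulas.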
About the journal:
The Journal of the American Society for Mass Spectrometry presents research papers covering all aspects of mass spectrometry, incorporating coverage of fields of scientific inquiry in which mass spectrometry can play a role.
Comprehensive in scope, the journal publishes papers on both fundamentals and applications of mass spectrometry. Fundamental subjects include instrumentation principles, design, and demonstration; structures and chemical properties of gas-phase ions; studies of thermodynamic properties; ion spectroscopy; chemical kinetics; mechanisms of ionization; theories of ion fragmentation; cluster ions; and potential energy surfaces. In addition to full papers, the journal offers Communications, Application Notes, and Accounts and Perspectives.