Journal of Chemical Information and Modeling: Latest Articles

Comparative Study of Allosteric GPCR Binding Sites and Their Ligandability Potential
IF 5.6 Zone 2 Chemistry
Journal of Chemical Information and Modeling Pub Date : 2024-10-23 DOI: 10.1021/acs.jcim.4c00819
Volume 64, Issue 21, Pages 8176–8192 PDF: https://pubs.acs.org/doi/epdf/10.1021/acs.jcim.4c00819
Sonja Peter, Lydia Siragusa, Morgan Thomas, Tommaso Palomba, Simon Cross, Noel M. O’Boyle, Dávid Bajusz, György G. Ferenczy, György M. Keserű, Giovanni Bottegoni, Brian Bender, Ijen Chen* and Chris De Graaf*
Abstract: The steadily growing number of experimental G-protein-coupled receptor (GPCR) structures has revealed diverse locations of allosteric modulation, and yet few drugs target them. This gap highlights the need for a deeper understanding of allosteric modulation in GPCR drug discovery. The current work introduces a systematic annotation scheme to structurally classify GPCR binding sites based on receptor class, transmembrane helix contacts, and, for membrane-facing sites, membrane sublocation. This GPCR-specific annotation scheme was applied to 107 GPCR structures bound by small molecules contributing to 24 distinct allosteric binding sites for comparative evaluation of three binding site detection methods (BioGPS, SiteMap, and FTMap). BioGPS identified the most in 22 of 24 sites. In addition, our property analysis showed that extrahelical allosteric ligands and binding sites represent a distinct chemical space characterized by shallow pockets with low volume, and the corresponding allosteric ligands showed an enrichment of halogens. Furthermore, we demonstrated that combining receptor and ligand similarity can be a viable method for ligandability assessment. One challenge regarding site prediction is the ligand shaping effect on the observed binding site, especially for extrahelical sites where the ligand-induced effect was most pronounced. To our knowledge, this is the first study presenting a binding site annotation scheme standardized for GPCRs, and it allows a comparison of allosteric binding sites across different receptors in an objective way. The insight from this study provides a framework for future GPCR binding site studies and highlights the potential of targeting allosteric sites for drug development.
Citations: 0
molli: A General Purpose Python Toolkit for Combinatorial Small Molecule Library Generation, Manipulation, and Feature Extraction
IF 5.6 Zone 2 Chemistry
Journal of Chemical Information and Modeling Pub Date : 2024-10-23 DOI: 10.1021/acs.jcim.4c00424
Volume 64, Issue 21, Pages 8083–8090
Alexander S. Shved*, Blake E. Ocampo, Elena S. Burlova, Casey L. Olen, N. Ian Rinehart and Scott E. Denmark*
Abstract: The construction, management, and analysis of large in silico molecular libraries is critical in many areas of modern chemistry. Herein, we introduce the MOLecular LIbrary toolkit, “molli”, which is a Python 3 cheminformatics module that provides a streamlined interface for manipulating large in silico libraries. Three-dimensional, combinatorial molecule libraries can be expanded directly from two-dimensional chemical structure fragments stored in CDXML files with high stereochemical fidelity. Geometry optimization, property calculation, and conformer generation are executed by interfacing with widely used computational chemistry programs such as OpenBabel, RDKit, ORCA, NWChem, and xTB/CREST. Conformer-dependent grid-based feature calculators provide numerical representation and interface to robust three-dimensional visualization tools that provide comprehensive images to enhance human understanding of libraries with thousands of members. The package includes a command-line interface in addition to Python classes to streamline frequently used workflows. Parallel performance is benchmarked on various hardware platforms, and common workflows are demonstrated for different tasks ranging from optimized grid-based descriptor calculation on catalyst libraries to an NMR chemical shift prediction workflow from CDXML files.
Citations: 0
Uncertainty Qualification for Deep Learning-Based Elementary Reaction Property Prediction
IF 5.6 Zone 2 Chemistry
Journal of Chemical Information and Modeling Pub Date : 2024-10-23 DOI: 10.1021/acs.jcim.4c01358
Volume 64, Issue 21, Pages 8131–8141
Yan Liu, Yiming Mo* and Youwei Cheng*
Abstract: The prediction of the thermodynamic and kinetic properties of elementary reactions has shown rapid improvement due to the implementation of deep learning (DL) methods. While various studies have reported the success in predicting reaction properties, the quantification of prediction uncertainty has seldom been investigated, thus compromising the confidence in using these predicted properties in practical applications. Here, we integrated graph convolutional neural networks (GCNN) with three uncertainty prediction techniques, including deep ensemble, Monte Carlo (MC)-dropout, and evidential learning, to provide insights into the uncertainty quantification and utility. The deep ensemble model outperforms others in accuracy and shows the highest reliability in estimating prediction uncertainty across all elementary reaction property data sets. We also verified that the deep ensemble model showed a satisfactory capability in recognizing epistemic and aleatoric uncertainties. Additionally, we adopted a Monte Carlo Tree Search method for extracting the explainable reaction substructures, providing a chemical explanation for DL predicted properties and corresponding uncertainties. Finally, to demonstrate the utility of uncertainty qualification in practical applications, we performed an uncertainty-guided calibration of the DL-constructed kinetic model, which achieved a 25% higher hit ratio in identifying dominant reaction pathways compared to that of the calibration without uncertainty guidance.
Citations: 0
Design and Test of Molecules that Interfere with the Recognition Mechanisms between the SARS-CoV-2 Spike Protein and Its Host Cell Receptors
IF 5.6 Zone 2 Chemistry
Journal of Chemical Information and Modeling Pub Date : 2024-10-23 DOI: 10.1021/acs.jcim.4c01511
Volume 64, Issue 21, Pages 8274–8282
Francesca Scantamburlo, Ionica Masgras, Francesco Ciscato, Claudio Laquatra, Francesco Frigerio, Fabrizio Cinquini, Silvia Pavoni, Alice Triveri, Elena Frasnetti, Stefano A. Serapian, Giorgio Colombo*, Andrea Rasola* and Elisabetta Moroni*
Abstract: The disruptive impact of the COVID-19 pandemic has led the scientific community to undertake an unprecedented effort to characterize viral infection mechanisms. Among these, interactions between the viral glycosylated Spike and the human receptors ACE2 and TMPRSS2 are key to allowing virus invasion. Here, we report and test a fully rational methodology to design molecules that are capable of perturbing the interactions between these critical players in SARS-CoV-2 pathogenicity. To this end, we computationally identify substructures on the fully glycosylated Spike protein that are not intramolecularly optimized and are thus prone to being stabilized by forming complexes with ACE2 and TMPRSS2. With the aim of competing with the Spike-mediated cell entry mechanisms, we have engineered the predicted putative interaction regions in the form of peptide mimics that could compete with Spike for interaction with ACE2 and/or TMPRSS2. Experimental models of viral entry demonstrate that the designed molecules are able to interfere with viral entry into ACE2/TMPRSS2 expressing cells, while they have no effects on the entry of control viral particles that do not harbor the Spike protein or on the entry of Spike-presenting viral particles into cells that do not display its receptors on their surface.
Citations: 0
Machine Learning-Driven Data Valuation for Optimizing High-Throughput Screening Pipelines
IF 5.6 Zone 2 Chemistry
Journal of Chemical Information and Modeling Pub Date : 2024-10-23 DOI: 10.1021/acs.jcim.4c01547
Volume 64, Issue 21, Pages 8142–8152 PDF: https://pubs.acs.org/doi/epdf/10.1021/acs.jcim.4c01547
Joshua Hesse, Davide Boldini* and Stephan A. Sieber*
Abstract: In the rapidly evolving field of drug discovery, high-throughput screening (HTS) is essential for identifying bioactive compounds. This study introduces a novel application of data valuation, a concept for evaluating the importance of data points based on their impact, to enhance drug discovery pipelines. Our approach improves active learning for compound library screening, robustly identifies true and false positives in HTS data, and identifies important inactive samples in an imbalanced HTS training, all while accounting for computational efficiency. We demonstrate that importance-based methods enable more effective batch screening, reducing the need for extensive HTS. Machine learning models accurately differentiate true biological activity from assay artifacts, streamlining the drug discovery process. Additionally, importance undersampling aids in HTS data set balancing, improving machine learning performance without omitting crucial inactive samples. These advancements could significantly enhance the efficiency and accuracy of drug development.
Citations: 0