MolGC: molecular geometry comparator algorithm for bond length mean absolute error computation on molecules
Javier Camarillo-Cisneros, Graciela Ramirez-Alonso, Carlos Arzate-Quintana, Hugo Varela-Rodríguez, Abimael Guzman-Pando
Molecular Diversity, volume 28, issue 4, pages 1925-1945. Published 2024-08-03. DOI: 10.1007/s11030-024-10945-2
https://link.springer.com/article/10.1007/s11030-024-10945-2
Abstract
Density Functional Theory (DFT) is extensively used in theoretical and computational chemistry to study molecular and crystal properties across diverse fields, including quantum chemistry, materials physics, catalysis, biochemistry, and surface science. Despite advances in hardware and software for DFT geometry optimization, reaching consensus when comparing computed molecular structures with their experimental counterparts remains a challenge. The difficulty is compounded by the lack of automated bond length comparison tools, which leaves researchers with labor-intensive and error-prone manual processes. To address these challenges, we propose MolGC, a Molecular Geometry Comparator algorithm that automates the comparison of optimized geometries obtained at different levels of theory. MolGC calculates the mean absolute error (MAE) of bond lengths by integrating data from various DFT software packages. It provides interactive and customizable visualization of geometries, enabling users to explore different views for enhanced analysis. In addition, it saves MAE computations for further analysis and offers a comprehensive statistical summary of the results. MolGC also addresses the complex graph labeling problem of identifying and categorizing bonds accurately across diverse chemical structures. It achieves an average correct bond label assignment rate of 98.91% on an antibiotics dataset, showcasing its effectiveness for comparing molecular bond lengths across geometries of varying complexity and size. The executable file and software resources for running MolGC can be downloaded from https://github.com/AbimaelGP/MolGC/tree/main.
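To make the central quantity concrete, the bond length MAE mentioned in the abstract can be sketched as follows. This is a minimal illustration, not MolGC's implementation: it assumes both geometries list their atoms in the same order and that the bonded atom pairs are already known, which sidesteps the bond labeling step the abstract describes. The function names, the coordinates, and the bond list are all hypothetical.

```python
import math


def bond_length(coords, i, j):
    """Euclidean distance between atoms i and j, in the units of coords."""
    return math.dist(coords[i], coords[j])


def bond_length_mae(reference, candidate, bonds):
    """Mean absolute error of bond lengths over a shared list of bonds.

    reference, candidate: sequences of (x, y, z) tuples with identical
    atom ordering (an assumption of this sketch).
    bonds: iterable of (i, j) atom-index pairs present in both geometries.
    """
    bonds = list(bonds)
    if not bonds:
        raise ValueError("at least one bond is required")
    errors = [
        abs(bond_length(reference, i, j) - bond_length(candidate, i, j))
        for i, j in bonds
    ]
    return sum(errors) / len(errors)


# Hypothetical three-atom geometries in angstroms (made-up values).
reference = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (0.0, 0.96, 0.0)]
candidate = [(0.0, 0.0, 0.0), (0.98, 0.0, 0.0), (0.0, 0.95, 0.0)]
bonds = [(0, 1), (0, 2)]

mae = bond_length_mae(reference, candidate, bonds)
print(f"MAE = {mae:.3f} angstrom")
```

In practice the hard part is establishing which bond in one geometry corresponds to which bond in the other when atom ordering differs between software outputs; the sketch takes that correspondence as given.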
About the journal:
Molecular Diversity is a new publication forum for the rapid publication of refereed papers dedicated to describing the development, application and theory of molecular diversity and combinatorial chemistry in basic and applied research and drug discovery. The journal publishes both short and full papers, perspectives, news and reviews dealing with all aspects of the generation of molecular diversity, application of diversity for screening against alternative targets of all types (biological, biophysical, technological), analysis of results obtained and their application in various scientific disciplines/approaches including:
combinatorial chemistry and parallel synthesis;
small molecule libraries;
microwave synthesis;
flow synthesis;
fluorous synthesis;
diversity oriented synthesis (DOS);
nanoreactors;
click chemistry;
multiplex technologies;
fragment- and ligand-based design;
structure/function/SAR;
computational chemistry and molecular design;
chemoinformatics;
screening techniques and screening interfaces;
analytical and purification methods;
robotics, automation and miniaturization;
targeted libraries;
display libraries;
peptides and peptoids;
proteins;
oligonucleotides;
carbohydrates;
natural diversity;
new methods of library formulation and deconvolution;
directed evolution, origin of life and recombination;
search techniques, landscapes, random chemistry and more.