A machine learning-driven prediction of Hammett constants using quantum chemical and structural descriptors
Authors: Vaneet Saini, Ranjeet Kumar
Journal: Physical Chemistry Chemical Physics
Publication date: 2025-05-30
DOI: 10.1039/d5cp01184a (https://doi.org/10.1039/d5cp01184a)
Understanding and predicting chemical reaction behavior is a fundamental challenge in chemistry. The Hammett equation, introduced in 1935, has been a cornerstone in modelling structure-activity relationships, particularly in physical organic chemistry. This study leverages machine learning (ML) to predict Hammett constants (σm and σp) for a diverse set of benzoic acid derivatives. We developed an open-source dataset of over 900 molecules, including meta-, para-, and symmetrically substituted variants, and employed various ML models to predict Hammett constants. Quantum chemical descriptors, combined with Mordred-based electronic, steric, and topological descriptors, were used to train models such as Extra Trees (ET) and Artificial Neural Networks (ANNs). The ANN model achieved the highest accuracy, with a test R² of 0.935 and an RMSE of 0.084, outperforming the other models and a previously developed graph neural network. Feature importance analysis revealed the key descriptors driving the predictions, including NBO charges and HOMO energies. Applicability domain (AD) analysis identified outliers and compounds outside the AD, supporting model reliability. This work highlights the potential of ML in predicting Hammett constants, offering a robust tool for chemical reactivity analysis and molecular design.
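For readers unfamiliar with the quantities named in the abstract, the following is a minimal sketch, not the authors' code. It assumes the textbook definition of the Hammett constant from benzoic acid ionization (log10(K/K0) = ρσ with ρ = 1, so σ = pKa(parent) − pKa(substituted)) and the standard definitions of R² and RMSE. The function names and all numeric values are hypothetical and chosen only for illustration; none of them come from the paper's dataset.

```python
import math


def hammett_sigma(pka_parent: float, pka_substituted: float) -> float:
    """Hammett constant from benzoic acid ionization, where rho = 1 by definition.

    sigma = log10(K_X / K_H) = pKa(parent) - pKa(substituted)
    """
    return pka_parent - pka_substituted


def rmse(y_true, y_pred) -> float:
    """Root-mean-square error between reference and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2_score(y_true, y_pred) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical pKa values, for illustration only.
    print(round(hammett_sigma(4.20, 3.70), 2))

    # Hypothetical reference and predicted sigma values, for illustration only.
    sigma_ref = [-0.20, 0.00, 0.25, 0.50, 0.75]
    sigma_pred = [-0.15, 0.05, 0.20, 0.55, 0.70]
    print(round(r2_score(sigma_ref, sigma_pred), 3))
    print(round(rmse(sigma_ref, sigma_pred), 3))
```

An electron-withdrawing substituent lowers the pKa of the acid and so gives a positive σ under this convention. R² and RMSE are the test-set metrics the abstract reports for the ANN model (0.935 and 0.084); the sketch only shows how such metrics are defined, not how those numbers were obtained.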
About the journal:
Physical Chemistry Chemical Physics (PCCP) is an international journal co-owned by 19 physical chemistry and physics societies from around the world. This journal publishes original, cutting-edge research in physical chemistry, chemical physics and biophysical chemistry. To be suitable for publication in PCCP, articles must include significant innovation and/or insight into physical chemistry; this is the most important criterion that reviewers and Editors will judge against when evaluating submissions.
The journal has a broad scope and welcomes contributions spanning experiment, theory, computation and data science. Topical coverage includes spectroscopy, dynamics, kinetics, statistical mechanics, thermodynamics, electrochemistry, catalysis, surface science, quantum mechanics, quantum computing and machine learning. Interdisciplinary research areas such as polymers and soft matter, materials, nanoscience, energy, surfaces/interfaces, and biophysical chemistry are welcomed if they demonstrate significant innovation and/or insight into physical chemistry. Joint experimental/theoretical studies are particularly appreciated when complementary and based on up-to-date approaches.