Rafał Stottko, Radosław Michalski, Bartłomiej M Szyja
Journal of Chemical Theory and Computation. Published 2025-05-12. DOI: 10.1021/acs.jctc.5c00291 (https://doi.org/10.1021/acs.jctc.5c00291)
RGBChem: Image-Like Representation of Chemical Compounds for Property Prediction.
In this work, we introduce RGBChem, a novel approach for converting chemical compounds into image representations, which are then used to train a convolutional neural network (CNN) to predict the HOMO-LUMO gap of compounds from the QM9 database. Because the order of atoms in the .xyz files used to generate these images is arbitrary, permuting that order yields multiple unique images (data points) from a single molecule, which expands the initial training set. The study shows that this approach leads to a statistically significant improvement in model accuracy, highlighting RGBChem as a powerful approach for applying machine learning (ML) in scenarios where the available data set is otherwise too small for ML methods to be effective.
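The augmentation idea in the abstract, generating several data points from one molecule by reordering the atoms of its .xyz file, can be illustrated with a minimal Python sketch. It covers only the atom-permutation step. The abstract does not describe how RGBChem turns an atom ordering into an image, so no image encoding is attempted here. The function names `parse_xyz` and `permuted_orderings`, the seed, and the water geometry are all hypothetical and chosen for illustration only.

```python
import random


def parse_xyz(text):
    """Parse a minimal .xyz block into a list of (element, x, y, z) tuples."""
    lines = text.strip().splitlines()
    n_atoms = int(lines[0])
    atoms = []
    # Line 0 holds the atom count and line 1 is a free-form comment.
    for line in lines[2:2 + n_atoms]:
        symbol, x, y, z = line.split()[:4]
        atoms.append((symbol, float(x), float(y), float(z)))
    return atoms


def permuted_orderings(atoms, n_variants, seed=0):
    """Return up to n_variants distinct atom orderings of one molecule.

    The first entry is the original ordering. This is an illustrative
    stand-in, not the authors' published procedure.
    """
    rng = random.Random(seed)
    seen = {tuple(atoms)}
    variants = [list(atoms)]
    attempts = 0
    # Cap the number of attempts so small molecules cannot loop forever.
    while len(variants) < n_variants and attempts < 100 * n_variants:
        attempts += 1
        candidate = list(atoms)
        rng.shuffle(candidate)
        key = tuple(candidate)
        if key not in seen:
            seen.add(key)
            variants.append(candidate)
    return variants


# Illustrative geometry only; not taken from QM9.
WATER_XYZ = """3
water (illustrative geometry)
O 0.000 0.000 0.117
H 0.000 0.757 -0.471
H 0.000 -0.757 -0.471
"""

atoms = parse_xyz(WATER_XYZ)
variants = permuted_orderings(atoms, n_variants=4)
for ordering in variants:
    print([symbol for symbol, *_ in ordering])
```

Each ordering describes the same molecule, so the target property (here, the HOMO-LUMO gap) is unchanged, while any representation that depends on atom order would differ between orderings. A molecule with n atoms admits at most n! orderings, which bounds how many variants can be produced.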
About the journal:
The Journal of Chemical Theory and Computation invites new and original contributions with the understanding that, if accepted, they will not be published elsewhere. Papers reporting new theories, methodology, and/or important applications in quantum electronic structure, molecular dynamics, and statistical mechanics are appropriate for submission to this Journal. Specific topics include advances in or applications of ab initio quantum mechanics, density functional theory, design and properties of new materials, surface science, Monte Carlo simulations, solvation models, QM/MM calculations, biomolecular structure prediction, and molecular dynamics in the broadest sense including gas-phase dynamics, ab initio dynamics, biomolecular dynamics, and protein folding. The Journal does not consider papers that are straightforward applications of known methods including DFT and molecular dynamics. The Journal favors submissions that include advances in theory or methodology with applications to compelling problems.