Shang Zhu, Bichlien H. Nguyen, Yingce Xia, Kali Frost, Shufang Xie, Venkatasubramanian Viswanathan and Jake A. Smith
Improved environmental chemistry property prediction of molecules with graph machine learning†
Green Chemistry, 2023, 17, 6612-6617. Published 2023-08-04. DOI: 10.1039/D3GC01920A
https://pubs.rsc.org/en/content/articlelanding/2023/gc/d3gc01920a
Abstract
Rapid prediction of environmental chemistry properties is critical for the green and sustainable development of the chemical industry and drug discovery. Machine learning methods can be applied to learn the relationships between chemical structures and their environmental impact. Graph machine learning, which learns representations directly from molecular graphs, may offer better predictive power than conventional feature-based models. In this work, we leveraged graph neural networks to predict the environmental chemistry properties of molecules. To systematically evaluate model performance, we selected a representative list of datasets, ranging from solubility to reactivity, and compared the graph model directly against commonly used methods on them. We found that the graph model achieved near state-of-the-art accuracy for all tasks and, for several, improved the accuracy by a large margin over conventional models that rely on human-designed chemical features. This demonstrates that graph machine learning can be a powerful tool for representation learning in environmental chemistry. Further, we compared the data efficiency of conventional feature-based models and graph neural networks, providing guidance for model selection depending on dataset size and feature requirements.
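To make "learning representations directly from molecular graphs" concrete, the following is a minimal sketch of one round of neighbour aggregation on a toy molecule, followed by a sum readout. It is not the model used in the paper, whose architecture the abstract does not describe. The one-hot atom features, the unweighted sum update, and all function names are assumptions chosen for illustration. A real graph neural network would add learned weights and nonlinearities at each step.

```python
# Illustrative only: one round of neighbour aggregation on a toy molecular graph.
# NOT the paper's model; features and update rule are assumptions.

# Ethanol heavy atoms as a graph: C0 - C1 - O2 (hydrogens omitted).
atoms = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]

# Hypothetical initial atom features: one-hot over the element types present.
ELEMENTS = ["C", "O"]


def one_hot(symbol):
    """Encode an element symbol as a one-hot vector over ELEMENTS."""
    return [1.0 if symbol == e else 0.0 for e in ELEMENTS]


def build_adjacency(n_atoms, bond_list):
    """Map each atom index to the indices of its bonded neighbours."""
    adj = {i: [] for i in range(n_atoms)}
    for a, b in bond_list:
        adj[a].append(b)
        adj[b].append(a)
    return adj


def message_passing_step(features, adj):
    """New atom feature = own feature + sum of neighbour features (no weights)."""
    updated = []
    for i, own in enumerate(features):
        agg = list(own)
        for j in adj[i]:
            agg = [x + y for x, y in zip(agg, features[j])]
        updated.append(agg)
    return updated


def readout(features):
    """Sum-pool atom features into a single molecule-level vector."""
    return [sum(column) for column in zip(*features)]


features = [one_hot(s) for s in atoms]
adj = build_adjacency(len(atoms), bonds)
features = message_passing_step(features, adj)
molecule_vector = readout(features)
```

After one step, each atom's vector reflects its immediate bonding environment: the central carbon, for example, now encodes that it neighbours both a carbon and an oxygen. A conventional feature-based model would instead start from a fixed, human-designed descriptor vector for the whole molecule.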
About the journal:
Green Chemistry is a journal that provides a unique forum for the publication of innovative research on the development of alternative green and sustainable technologies. The scope of Green Chemistry is based on the definition proposed by Anastas and Warner (Green Chemistry: Theory and Practice, P T Anastas and J C Warner, Oxford University Press, Oxford, 1998), which defines green chemistry as the utilisation of a set of principles that reduces or eliminates the use or generation of hazardous substances in the design, manufacture and application of chemical products. Green Chemistry aims to reduce the environmental impact of the chemical enterprise by developing a technology base that is inherently non-toxic to living things and the environment. The journal welcomes submissions on all aspects of research relating to this endeavor and publishes original and significant cutting-edge research that is likely to be of wide general appeal. For a work to be published, it must present a significant advance in green chemistry, including a comparison with existing methods and a demonstration of advantages over those methods.