Machine learning analysis of a large set of homopolymers to predict glass transition temperatures
Gerardo M. Casanola-Martin, Anas Karuth, Hai Pham-The, Humbert González-Díaz, Dean C. Webster, Bakhtiyor Rasulev
Communications Chemistry, pp. 1-9. Published 2024-10-02. DOI: 10.1038/s42004-024-01305-0
https://www.nature.com/articles/s42004-024-01305-0
Abstract
The glass transition temperature of polymers, Tg, is an important thermophysical property that can be difficult to measure experimentally. Data-driven machine learning approaches therefore offer an important alternative for assessing Tg values in a high-throughput way. In this study, a large dataset of more than 900 polymers with reported Tg values was assembled from various public sources to develop a predictive model of the structure-property relationships. The collected dataset was curated, explored via cluster analysis, and split into training and test sets for validation, and the polymer structures were characterized by molecular descriptors. To find the models, several machine learning techniques were explored: multiple linear regression (MLR), k-nearest neighbor (k-NN), support vector machine (SVM), random forest (RF), Gaussian process regression (GPR), and multi-layer perceptron (MLP). As a result, a model that accurately predicts glass transition temperatures from a subset of 15 descriptors was developed. The electronic effect indices were determined to be important properties that contribute positively to the Tg values. The SVM-based model showed the best performance, with determination coefficients (R²) of 0.813 and 0.770 for the training and test sets, respectively, and the lowest estimation error (RMSE = 0.062). In addition, the developed structure-property model was implemented as a web app that serves as an online computational tool to design and evaluate new homopolymers with desired glass transition profiles.
Glass transition temperatures (Tg) of polymers are important thermophysical descriptors, but they can be difficult to determine experimentally. Here, the authors develop a data-driven support vector machine structure-property model to assess Tg values in a high-throughput manner, and implement the model into a web app.
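The two metrics quoted in the abstract, the determination coefficient (R²) and the root-mean-square error (RMSE), can be made concrete with a minimal sketch. The function names and the Tg values below are hypothetical and chosen only for illustration; they are not taken from the paper, and this is not the authors' code.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


# Hypothetical Tg values in kelvin (illustrative only, not from the paper).
observed = [300.0, 350.0, 400.0, 450.0]
predicted = [310.0, 340.0, 410.0, 440.0]

print(round(r_squared(observed, predicted), 3))  # 0.968
print(round(rmse(observed, predicted), 3))       # 10.0
```

R² is dimensionless, whereas RMSE carries the units of the target variable. The abstract does not state the scale on which its RMSE of 0.062 was computed, so that value cannot be read directly as an error in kelvin.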
About the journal:
Communications Chemistry is an open access journal from Nature Research publishing high-quality research, reviews and commentary in all areas of the chemical sciences. Research papers published by the journal represent significant advances bringing new chemical insight to a specialized area of research. We also aim to provide a community forum for issues of importance to all chemists, regardless of sub-discipline.