Predicting homopolymer and copolymer solubility through machine learning†

IF 6.2 Q1 CHEMISTRY, MULTIDISCIPLINARY

Digital Discovery, Pub Date: 2024-12-24, DOI: 10.1039/D4DD00290C

Christopher D. Stubbs, Yeonjoon Kim, Ethan C. Quinn, Raúl Pérez-Soto, Eugene Y.-X. Chen and Seonah Kim

Volume 2, pages 424-437. Article page: https://pubs.rsc.org/en/content/articlelanding/2025/dd/d4dd00290c


Abstract

Polymer solubility has applications in many important and diverse fields, including microprocessor fabrication, environmental conservation, paint formulation, and drug delivery, but it remains under-explored relative to its importance. This is evident in the scarcity of solvent-based systems for recycling plastics, despite the need for efficient and selective methods amid the looming plastics and climate crises. To address this need for better predictive tools, this work examines classical and deep machine learning (ML) models for predicting the categorical solubility of homopolymers and copolymers, with model architectures including random forest (RF), decision tree (DT), naive Bayes, AdaBoost, and graph neural networks (GNNs). We achieve high accuracy for both our homopolymer (82%, RF) and copolymer (92%, RF) models on unseen polymer–solvent systems in 5-fold cross-validation studies. The relevance and applicability of our homopolymer models are then verified through in-house experiments examining the solubility of common commercial plastics, followed by an explainable AI (XAI) analysis using Shapley Additive Explanations (SHAP), which explores the relative contribution of each feature toward model predictions. We then apply our homopolymer solubility prediction model to remove unwanted or hazardous additives in polyethylene (PE) and polystyrene (PS) waste. This work demonstrates the validity and feasibility of using ML to predict homopolymer solubility, provides novel ML models for the prediction of copolymer solubility, and explains homopolymer model predictions before applying the explained model to a globally relevant waste challenge.
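The abstract reports accuracy on unseen polymer–solvent systems under 5-fold cross-validation but does not describe how the folds were built. The sketch below shows one possible reading, assuming that every record of a given polymer–solvent pair is held out together so that no pair appears in both training and test data. The function names `grouped_k_fold` and `accuracy`, the greedy fold assignment, and the toy records are all hypothetical illustrations, not the authors' procedure or data.

```python
from collections import defaultdict


def grouped_k_fold(groups, k=5):
    """Yield (train_idx, test_idx) pairs where no group spans both sides.

    groups[i] is a hashable label for sample i, here a hypothetical
    (polymer, solvent) tuple. Groups are dealt out largest-first, each
    into the currently smallest fold, so fold sizes stay roughly even.
    """
    members = defaultdict(list)
    for idx, group in enumerate(groups):
        members[group].append(idx)
    if len(members) < k:
        raise ValueError("need at least k distinct groups")

    folds = [[] for _ in range(k)]
    for group in sorted(members, key=lambda g: -len(members[g])):
        smallest = min(range(k), key=lambda f: len(folds[f]))
        folds[smallest].extend(members[group])

    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for j in range(k) if j != f for i in folds[j])
        yield train, test


def accuracy(y_true, y_pred):
    """Fraction of categorical labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up example: six polymer-solvent systems, two records each.
systems = [("PS", "toluene"), ("PS", "water"), ("PE", "xylene"),
           ("PE", "water"), ("PMMA", "acetone"), ("PMMA", "hexane")]
groups = [s for s in systems for _ in range(2)]

for train_idx, test_idx in grouped_k_fold(groups, k=3):
    held_out = {groups[i] for i in test_idx}
    seen = {groups[i] for i in train_idx}
    # Every test system is unseen during training in this fold.
    assert held_out.isdisjoint(seen)
```

In practice a classifier such as a random forest would be fitted on each `train_idx` and scored with `accuracy` on the matching `test_idx`; the per-fold scores are then averaged.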


Source journal

Digital discovery

CiteScore

2.80

Self-citation rate

0.00%

Articles published