Machine learning prediction of heat capacity of polymers as a function of temperature

IF 4.5 | Zone 2 (Chemistry) | JCR Q2 (Polymer Science)
Kazuhiko Ishikiriyama
Journal: Polymer. Published: 2025-10-01. DOI: 10.1016/j.polymer.2025.129171 (https://doi.org/10.1016/j.polymer.2025.129171)

Abstract

Machine learning models were developed using the high-quality ATHAS (Advanced Thermal Analysis System) data bank to predict the constant-pressure heat capacity (C_P) of polymers at 10 K intervals from 10 to 500 K. Molecular fingerprints (FPs) served as features, specifically circular Morgan fingerprints with a bond diameter of 4 derived from the repeating structural units of the polymers. For polymers contained in the ATHAS data bank (e.g., polypropylene and polyamide 6), the predicted C_P values showed mean relative errors (MREs) within ±3%. In contrast, for polymers absent from the data bank, including poly(p-dioxanone), poly(N-vinylpyrrolidone), and starch, a positive correlation was observed between the MRE and the number of missing substructures (N_ms), defined as the number of hashed identifiers present in the target polymer but absent from the ATHAS-derived feature space. Using this correlation, C_P predictions for polymers with N_ms > 0 were adjusted, reducing the MREs to within ±3%. To improve accuracy, additional models employing alternative FPs were built: polyBERT FP, generated from a pre-trained BERT-based chemical language model, and OMG FP and SMiPoly FP, derived from the virtual polymer libraries OMG and SMiPoly. For polymers with N_ms > 0, all alternative FPs yielded lower MREs than uncorrected Morgan fingerprints. The lowest MREs were achieved using a hybrid FP constructed from OMG and 10% of the SMiPoly dataset, demonstrating enhanced extrapolative performance. Because of computational limits, molecular dynamics struggles to capture the temperature dependence of C_P, whereas trained machine learning models may rapidly predict it for many polymers, suggesting their potential as a practical alternative.
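The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation. The abstract does not give the exact MRE formula or the form of the correction, so the sketch assumes a signed mean relative error in percent and a linear fit of MRE against the missing-substructure count. All function names are hypothetical, and the hashed identifiers are passed in as plain integer sets that would in practice come from a fingerprinting library.

```python
from statistics import mean


def temperature_grid(start_k=10, stop_k=500, step_k=10):
    """Temperatures (K) at which heat capacity is predicted: 10, 20, ..., 500."""
    return list(range(start_k, stop_k + 1, step_k))


def mean_relative_error(predicted, reference):
    """Signed mean relative error in percent (assumed definition of MRE)."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    return 100.0 * mean((p - r) / r for p, r in zip(predicted, reference))


def count_missing_substructures(target_ids, training_ids):
    """Count hashed identifiers present in the target polymer but absent
    from the feature space built from the training polymers."""
    return len(set(target_ids) - set(training_ids))


def fit_line(xs, ys):
    """Ordinary least-squares fit ys ~ slope * xs + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must contain at least two distinct values")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def correct_prediction(predicted, n_missing, slope, intercept):
    """Divide out the relative error expected for this missing-substructure
    count. The linear, multiplicative form is an assumption of this sketch."""
    if n_missing == 0:
        return list(predicted)  # no correction when nothing is missing
    expected_mre = slope * n_missing + intercept  # percent
    return [p / (1.0 + expected_mre / 100.0) for p in predicted]
```

A typical use would fit `fit_line` on (missing-substructure count, MRE) pairs from polymers with known heat capacities, then apply `correct_prediction` to a new polymer whose count is greater than zero.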


Source journal: Polymer (Chemistry, Polymer Science)
CiteScore: 7.90
Self-citation rate: 8.70%
Articles published: 959
Review time: 32 days
About the journal: Polymer is an interdisciplinary journal dedicated to publishing innovative and significant advances in Polymer Physics, Chemistry and Technology. We welcome submissions on polymer hybrids, nanocomposites, characterisation and self-assembly. Polymer also publishes work on the technological application of polymers in energy and optoelectronics. The main scope covers, but is not limited to, the following core areas:
Polymer Materials: nanocomposites and hybrid nanomaterials; polymer blends, films, fibres, networks and porous materials.
Physical Characterization: characterisation, modelling and simulation of molecular and materials properties in bulk, solution, and thin films.
Polymer Engineering: advanced multiscale processing methods.
Polymer Synthesis, Modification and Self-assembly: including designer polymer architectures, mechanisms and kinetics, and supramolecular polymerization.
Technological Applications: polymers for energy generation and storage; polymer membranes for separation technology; polymers for opto- and microelectronics.