OpenPoly: A Polymer Database Empowering Benchmarking and Multi-property Predictions

IF 4.0 · Zone 2 (Chemistry) · JCR Q2 (Polymer Science)
Ji-Feng Wang, Yu-Bo Sun, Qiu-Tong Chen, Fei-Fan Ji, Yuan-Yuan Song, Meng-Yuan Ruan, Ying Wang
Chinese Journal of Polymer Science, 43(10), pp. 1749–1760. Published 2025-09-10.
DOI: 10.1007/s10118-025-3402-y
https://link.springer.com/article/10.1007/s10118-025-3402-y

Abstract

Advancing the integration of artificial intelligence and polymer science requires high-quality, open-source, and large-scale datasets. However, existing polymer databases often suffer from data sparsity, a lack of polymer-property labels, and limited accessibility, which hinders systematic modeling across property prediction tasks. Here, we present OpenPoly, a curated experimental polymer database derived from extensive literature mining and manual validation, comprising 3985 unique polymer-property data points spanning 26 key properties. We further develop a multi-task benchmarking framework that evaluates property prediction using four encoding methods and eight representative models. Our results show that the optimized degree-of-polymerization encoding coupled with Morgan fingerprints achieves the best trade-off between computational cost and accuracy. Under data-scarce conditions, XGBoost outperforms deep learning models on key properties such as dielectric constant, glass transition temperature, melting point, and mechanical strength, achieving R² scores of 0.65–0.87. To further showcase the practical utility of the database, we propose candidate polymers for two energy-relevant applications: high-temperature polymer dielectrics and fuel cell membranes. By offering a consistent and accessible benchmark and database, OpenPoly paves the way for more accurate polymer-property modeling and fosters data-driven advances in polymer genome engineering.
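
For readers unfamiliar with the R² metric quoted above, the following is a minimal sketch of the standard coefficient of determination, R² = 1 - SS_res / SS_tot. The function name `r2_score` and the sample values are illustrative assumptions; the abstract does not state the exact evaluation protocol (data splits, averaging across tasks) used in the paper.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


# Made-up glass transition temperatures in kelvin, for illustration only;
# these are NOT values from the OpenPoly database.
measured = [350.0, 400.0, 450.0, 500.0]
predicted = [360.0, 390.0, 460.0, 490.0]
score = r2_score(measured, predicted)
```

A score of 1 means the predictions match the measurements exactly, while a score of 0 means the model does no better than always predicting the mean of the measured values.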

Source journal
Chinese Journal of Polymer Science (Chemistry: Polymer Science)
CiteScore: 7.10
Self-citation rate: 11.60%
Articles published: 218
Review time: 6.0 months
About the journal: Chinese Journal of Polymer Science (CJPS) is a monthly journal published in English and sponsored by the Chinese Chemical Society and the Institute of Chemistry, Chinese Academy of Sciences. CJPS is edited by a distinguished Editorial Board headed by Professor Qi-Feng Zhou and supported by an International Advisory Board that includes many well-known, active polymer scientists from around the world. The journal was first published in 1983 under the title Polymer Communications and has carried its current name since 1985. CJPS is a peer-reviewed journal dedicated to the timely publication of original research ideas and results in the field of polymer science. Issues may carry regular papers, rapid communications, and notes, as well as feature articles. As a leading polymer journal in China published in English, CJPS reflects new achievements from laboratories across China. It also includes papers submitted by scientists from other countries and regions, reflecting the international nature of the journal.