High-accuracy physical property prediction for pure organics via molecular representation learning: bridging data to discovery
Authors: Qi Ou, Hongshuai Wang, Minyang Zhuang, Shangqian Chen, Lele Liu, Ning Wang, Zhifeng Gao
Journal: npj Computational Materials. Published: 2025-07-11. DOI: https://doi.org/10.1038/s41524-025-01720-4
The escalating energy crisis has spurred extensive research into organic compounds for energy-efficient applications, taking advantage of their environmental friendliness, cost-effective synthesis, and adaptable molecular structures. Traditional trial-and-error methods for discovering highly functional organic compounds are expensive and time-consuming. We employed a 3D transformer-based molecular representation learning algorithm to create the Org-Mol pre-trained model, using 60 million semi-empirically optimized small organic molecule structures. After fine-tuning with public experimental data, the model can accurately predict various physical properties of pure organics, with test set R² values exceeding 0.92. These fine-tuned models are used in high-throughput screening among millions of ester molecules to identify novel immersion coolants, resulting in the experimental validation of two promising candidates. This work not only demonstrates the potential of Org-Mol in predicting bulk properties for pure organic compounds but also paves the way for the rational and efficient development of ideal candidates for energy-saving materials.
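The abstract reports test-set R² values without defining the metric. The sketch below assumes it is the standard coefficient of determination, 1 - SS_res / SS_tot. The function name `r2_score` and the sample numbers are hypothetical and for illustration only; they are not taken from the paper.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: squared errors of the predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the true values around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical measured vs. predicted property values (not from the paper).
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
score = r2_score(measured, predicted)
print(f"R^2 = {score:.2f}")
```

Under this definition, a value above 0.92 means the model's squared prediction error is less than 8% of the variance of the measured values in the test set.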
About the journal:
npj Computational Materials is a high-quality open access journal from Nature Research that publishes research papers applying computational approaches for the design of new materials and enhancing our understanding of existing ones. The journal also welcomes papers on new computational techniques and the refinement of current approaches that support these aims, as well as experimental papers that complement computational findings.
Some key features of npj Computational Materials include a 2-year impact factor of 12.241 (2021), article downloads of 1,138,590 (2021), and a fast turnaround time of 11 days from submission to the first editorial decision. The journal is indexed in various databases and services, including Chemical Abstracts Service (CAS), Astrophysics Data System (ADS), Current Contents/Physical, Chemical and Earth Sciences, Journal Citation Reports/Science Edition, SCOPUS, EI Compendex, INSPEC, Google Scholar, SCImago, DOAJ, CNKI, and Science Citation Index Expanded (SCIE), among others.