Digital Discovery: latest articles

Auto-generating question-answering datasets with domain-specific knowledge for language models in scientific tasks†
IF 6.2
Digital discovery Pub Date : 2025-02-24 DOI: 10.1039/D4DD00307A
Zongqian Li and Jacqueline M. Cole
{"title":"Auto-generating question-answering datasets with domain-specific knowledge for language models in scientific tasks†","authors":"Zongqian Li and Jacqueline M. Cole","doi":"10.1039/D4DD00307A","DOIUrl":"https://doi.org/10.1039/D4DD00307A","url":null,"abstract":"<p >Large language models (LLMs) have emerged as a useful tool for the public to process and respond to a vast range of interactive text-based queries. While foundational LLMs are well suited to making general user queries, smaller language models that have been trained on custom text from a specific domain of interest tend to display superior performance on queries about that domain, can operate faster and improve efficiency. Nonetheless, considerable resources are still needed to pre-train a language model with custom data. We present a pipeline that shows a way to overcome this need for pre-training. The pipeline first uses new algorithms that we have designed to produce a large, high-quality question-answering dataset (SCQA) for a particular domain of interest, solar cells. These algorithms employed a solar-cell database that had been auto-generated using the ‘chemistry-aware’ natural language processing tool, ChemDataExtractor. In turn, this SCQA dataset is used to fine-tune language models, whose resulting <em>F</em><small><sub>1</sub></small>-scores of performance far exceed (by 10–20%) those of analogous language models that have been fine-tuned against a general-English language QA dataset, SQuAD. Importantly, the performance of the language models fine-tuned against the SCQA dataset does not depend on the size of their architecture, whether or not the tokens were cased or uncased or whether or not the foundational language models were further pre-trained with domain-specific data or fine-tuned directly from their vanilla state. 
This shows that this domain-specific SCQA dataset produced by our algorithms has sufficient intrinsic domain knowledge to be directly fine-tuned against a foundational language model for immediate use with improved performance.</p>","PeriodicalId":72816,"journal":{"name":"Digital discovery","volume":" 4","pages":" 998-1005"},"PeriodicalIF":6.2,"publicationDate":"2025-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://pubs.rsc.org/en/content/articlepdf/2025/dd/d4dd00307a?page=search","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143809093","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
SANE: strategic autonomous non-smooth exploration for multiple optima discovery in multi-modal and non-differentiable black-box functions†
IF 6.2
Digital discovery Pub Date : 2025-02-18 DOI: 10.1039/D4DD00299G
Arpan Biswas, Rama Vasudevan, Rohit Pant, Ichiro Takeuchi, Hiroshi Funakubo and Yongtao Liu
{"title":"SANE: strategic autonomous non-smooth exploration for multiple optima discovery in multi-modal and non-differentiable black-box functions†","authors":"Arpan Biswas, Rama Vasudevan, Rohit Pant, Ichiro Takeuchi, Hiroshi Funakubo and Yongtao Liu","doi":"10.1039/D4DD00299G","DOIUrl":"https://doi.org/10.1039/D4DD00299G","url":null,"abstract":"<p >Both computational and experimental material discovery bring forth the challenge of exploring multidimensional and multimodal parameter spaces, such as phase diagrams of Hamiltonians with multiple interactions, composition spaces of combinatorial libraries, material structure image spaces, and molecular embedding spaces. Often these systems are black-boxes and time-consuming to evaluate, which resulted in strong interest towards active learning methods such as Bayesian optimization (BO). However, these systems are often noisy, which makes the black box function severely multi-modal and non-differentiable, where a vanilla BO can get overly focused near a single or faux optimum, deviating from the broader goal of scientific discovery. To address these limitations, here we developed Strategic Autonomous Non-Smooth Exploration (SANE) to facilitate an intelligent Bayesian optimized navigation with a proposed cost-driven probabilistic acquisition function to find multiple global and local optimal regions, avoiding the tendency to become trapped in a single optimum. To distinguish between a true and false optimal region due to noisy experimental measurements, a human (domain) knowledge driven dynamic surrogate gate is integrated with SANE. We implemented the gate-SANE into pre-acquired piezoresponse spectroscopy data of a ferroelectric combinatorial library with high noise levels in specific regions, and piezoresponse force microscopy (PFM) hyperspectral data. 
SANE demonstrated better performance than classical BO to facilitate the exploration of multiple optimal regions and thereby prioritized learning with higher coverage of scientific values in autonomous experiments. Our work showcases the potential application of this method to real-world experiments, where such combined strategic and human intervening approaches can be critical to unlocking new discoveries in autonomous research.</p>","PeriodicalId":72816,"journal":{"name":"Digital discovery","volume":" 3","pages":" 853-867"},"PeriodicalIF":6.2,"publicationDate":"2025-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://pubs.rsc.org/en/content/articlepdf/2025/dd/d4dd00299g?page=search","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143602069","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Dissecting errors in machine learning for retrosynthesis: a granular metric framework and a transformer-based model for more informative predictions
IF 6.2
Digital discovery Pub Date : 2025-02-18 DOI: 10.1039/D4DD00263F
Arihanth Srikar Tadanki, H. Surya Prakash Rao and U. Deva Priyakumar
{"title":"Dissecting errors in machine learning for retrosynthesis: a granular metric framework and a transformer-based model for more informative predictions","authors":"Arihanth Srikar Tadanki, H. Surya Prakash Rao and U. Deva Priyakumar","doi":"10.1039/D4DD00263F","DOIUrl":"https://doi.org/10.1039/D4DD00263F","url":null,"abstract":"<p >Chemical reaction prediction, encompassing forward synthesis and retrosynthesis, stands as a fundamental challenge in organic synthesis. A widely adopted computational approach frames synthesis prediction as a sequence-to-sequence translation task, using the commonly used SMILES representation for molecules. The current evaluation of machine learning methods for retrosynthesis assumes perfect training data, overlooking imperfections in reaction equations in popular datasets, such as missing reactants, products, other physical and practical constraints such as temperature and cost, primarily due to a focus on the target molecule. This limitation leads to an incomplete representation of viable synthetic routes, especially when multiple sets of reactants can yield a given desired product. In response to these shortcomings, this study examines the prevailing evaluation methods and introduces comprehensive metrics designed to address imperfections in the dataset. Our novel metrics not only assess absolute accuracy by comparing predicted outputs with ground truth but also introduce a nuanced evaluation approach. We provide scores for partial correctness and compute adjusted accuracy through graph matching, acknowledging the inherent complexities of retrosynthetic pathways. Additionally, we explore the impact of small molecular augmentations while preserving chemical properties and employ similarity matching to enhance the assessment of prediction quality. We introduce SynFormer, a sequence-to-sequence model tailored for SMILES representation. 
It incorporates architectural enhancements to the original transformer, effectively tackling the challenges of chemical reaction prediction. SynFormer achieves a Top-1 accuracy of 53.2% on the USPTO-50k dataset, matching the performance of widely accepted models like Chemformer, but with greater efficiency by eliminating the need for pre-training.</p>","PeriodicalId":72816,"journal":{"name":"Digital discovery","volume":" 3","pages":" 831-845"},"PeriodicalIF":6.2,"publicationDate":"2025-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://pubs.rsc.org/en/content/articlepdf/2025/dd/d4dd00263f?page=search","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143602067","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Digital workflow optimization of van der Waals methods for improved halide perovskite solar materials†
IF 6.2
Digital discovery Pub Date : 2025-02-18 DOI: 10.1039/D4DD00312H
Celso R. C. Rêgo, Wolfgang Wenzel, Maurício J. Piotrowski, Alexandre C. Dias, Carlos Maciel de Oliveira Bastos, Luis O. de Araujo and Diego Guedes-Sobrinho
{"title":"Digital workflow optimization of van der Waals methods for improved halide perovskite solar materials†","authors":"Celso R. C. Rêgo, Wolfgang Wenzel, Maurício J. Piotrowski, Alexandre C. Dias, Carlos Maciel de Oliveira Bastos, Luis O. de Araujo and Diego Guedes-Sobrinho","doi":"10.1039/D4DD00312H","DOIUrl":"https://doi.org/10.1039/D4DD00312H","url":null,"abstract":"<p >Hybrid organic–inorganic metal halide perovskites are low-cost and highly efficient materials used in solar cell devices. However, the intricacies of perovskites that merge organic cations with inorganic frameworks necessitate further elucidation, particularly from the long-range van der Waals perspective. Here, we scrutinize the van der Waals (vdW) methods by conceptualizing organic cations for XH<small><sub>4</sub></small>PbI<small><sub>3</sub></small> and CH<small><sub>3</sub></small>XH<small><sub>3</sub></small>PbI<small><sub>3</sub></small> prototype perovskites (X = N, P, As, and Sb), to investigate the thermodynamic stability. To handle the enormous amount of raw data generated from DFT + vdW + SOC with DFT-1/2 (quasi-particle correction method), we have used the SimStack workflow framework, which enhanced the efficiency, reproducibility, and data transferability. The results reveal the critical role of the organic cations, inferred from ionic radius estimates and documented electronegativity, in elucidating the accommodation of symmetric XH<small><sub>4</sub></small><small><sup>+</sup></small> or asymmetric CH<small><sub>3</sub></small>XH<small><sub>3</sub></small><small><sup>+</sup></small> cations within the limited volumes of cuboctahedral cavities. 
The discrepancy in the ionic size within the XH<small><sub>4</sub></small>PbI<small><sub>3</sub></small> (CH<small><sub>3</sub></small>XH<small><sub>3</sub></small>PbI<small><sub>3</sub></small>) group positions NH<small><sub>4</sub></small>PbI<small><sub>3</sub></small> (CH<small><sub>3</sub></small>NH<small><sub>3</sub></small>PbI<small><sub>3</sub></small>) outside (within) the stable perovskite region suggests the theoretical viability of perovskites containing phosphonium, arsonium, and stibonium beyond CH<small><sub>3</sub></small>NH<small><sub>3</sub></small>PbI<small><sub>3</sub></small>. As we move from N to Sb, the organic cation's properties, such as ionic radius and electronegativity, affect the thermodynamic stability and local geometry of octahedra, directly influencing the band gaps.</p>","PeriodicalId":72816,"journal":{"name":"Digital discovery","volume":" 4","pages":" 927-942"},"PeriodicalIF":6.2,"publicationDate":"2025-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://pubs.rsc.org/en/content/articlepdf/2025/dd/d4dd00312h?page=search","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143809079","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Active learning high coverage sets of complementary reaction conditions†
IF 6.2
Digital discovery Pub Date : 2025-02-17 DOI: 10.1039/D4DD00365A
Sofia L. Sivilotti, David M. Friday and Nicholas E. Jackson
{"title":"Active learning high coverage sets of complementary reaction conditions†","authors":"Sofia L. Sivilotti, David M. Friday and Nicholas E. Jackson","doi":"10.1039/D4DD00365A","DOIUrl":"https://doi.org/10.1039/D4DD00365A","url":null,"abstract":"<p >Chemical reaction conditions capable of producing high yields over diverse reactants are a desired component of nearly all chemical and materials discovery campaigns. While much work has been done to discover individual general reaction conditions, any single conditions are necessarily limited over increasingly diverse chemical spaces. A potential solution to this problem is to identify small sets of complementary reaction conditions that, when combined, cover a larger chemical space than any one general reaction condition. In this work, we analyze experimentally derived datasets to assess the relative performance of individual general reaction conditions <em>vs.</em> sets of complementary reaction conditions. We then propose and benchmark active learning methods to efficiently discover these complementary sets of conditions. The results show the value of active learning in identifying complementary sets of reaction conditions and provide an avenue for improving synthetic hit rates in high-throughput synthesis campaigns.</p>","PeriodicalId":72816,"journal":{"name":"Digital discovery","volume":" 3","pages":" 846-852"},"PeriodicalIF":6.2,"publicationDate":"2025-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://pubs.rsc.org/en/content/articlepdf/2025/dd/d4dd00365a?page=search","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143602068","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
AI-powered exploration of molecular vibrations, phonons, and spectroscopy
IF 6.2
Digital discovery Pub Date : 2025-02-14 DOI: 10.1039/D4DD00353E
Bowen Han, Ryotaro Okabe, Abhijatmedhi Chotrattanapituk, Mouyang Cheng, Mingda Li and Yongqiang Cheng
{"title":"AI-powered exploration of molecular vibrations, phonons, and spectroscopy","authors":"Bowen Han, Ryotaro Okabe, Abhijatmedhi Chotrattanapituk, Mouyang Cheng, Mingda Li and Yongqiang Cheng","doi":"10.1039/D4DD00353E","DOIUrl":"https://doi.org/10.1039/D4DD00353E","url":null,"abstract":"<p >The vibrational dynamics of molecules and solids play a critical role in defining material properties, particularly their thermal behaviors. However, theoretical calculations of these dynamics are often computationally intensive, while experimental approaches can be technically complex and resource-demanding. Recent advancements in data-driven artificial intelligence (AI) methodologies have substantially enhanced the efficiency of these studies. This review explores the latest progress in AI-driven methods for investigating atomic vibrations, emphasizing their role in accelerating computations and enabling rapid predictions of lattice dynamics, phonon behaviors, molecular dynamics, and vibrational spectra. Key developments are discussed, including advancements in databases, structural representations, machine-learning interatomic potentials, graph neural networks, and other emerging approaches. Compared to traditional techniques, AI methods exhibit transformative potential, dramatically improving the efficiency and scope of research in materials science. 
The review concludes by highlighting the promising future of AI-driven innovations in the study of atomic vibrations.</p>","PeriodicalId":72816,"journal":{"name":"Digital discovery","volume":" 3","pages":" 584-624"},"PeriodicalIF":6.2,"publicationDate":"2025-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://pubs.rsc.org/en/content/articlepdf/2025/dd/d4dd00353e?page=search","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143602085","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Towards a universal scaling method for predicting equilibrium constants of polyoxometalates†
IF 6.2
Digital discovery Pub Date : 2025-02-14 DOI: 10.1039/D4DD00358F
Jordi Buils, Diego Garay-Ruiz, Enric Petrus, Mireia Segado-Centellas and Carles Bo
{"title":"Towards a universal scaling method for predicting equilibrium constants of polyoxometalates†","authors":"Jordi Buils, Diego Garay-Ruiz, Enric Petrus, Mireia Segado-Centellas and Carles Bo","doi":"10.1039/D4DD00358F","DOIUrl":"https://doi.org/10.1039/D4DD00358F","url":null,"abstract":"<p >The computational prediction of equilibrium constants is still an open problem for a wide variety of relevant chemical systems. In particular, acid dissociation constants (p<em>K</em><small><sub>a</sub></small>) are an essential asset in biological, synthetic and industrial chemistry whose prediction encounters several difficulties, requiring the development of novel strategies. The self-assembly of polyoxometalates (POMs) is another complex problem where acid-base reactions play a central role; the successful prediction of the formation constants of these structures is intimately linked with the limitations of p<em>K</em><small><sub>a</sub></small> determination. Our methodology POMSimulator enables the prediction of these polyoxometalate formation constants from Density Functional Theory (DFT) calculations, using the experimental <em>K</em><small><sub>f</sub></small> values available in the literature to fit the resulting predictions. In this work, we carry out a systematic analysis of a very large number of POM formation constants already predicted through the application of POMSimulator. We then propose a universal scaling scheme for the adjustment of the DFT-based formation constants of POMs, relying on a linear scaling of the form <em>y</em> = <em>mx</em> + <em>b</em>. Here, the slope (<em>m</em>) is a constant parameter – hence, universal towards the nature of the polyoxometalate and the calculation method. The intercept (<em>b</em>), in contrast, is a system-dependent parameter that can be predicted with a multi-linear regression model trained with statistical aggregates of the non-scaled formation constants. 
Thus, we are able to successfully predict the speciation and phase diagrams of POM systems for which available experimental data are minimal, as well as provide a general scaling scheme that might be extended to other kinds of chemical systems.</p>","PeriodicalId":72816,"journal":{"name":"Digital discovery","volume":" 4","pages":" 970-978"},"PeriodicalIF":6.2,"publicationDate":"2025-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://pubs.rsc.org/en/content/articlepdf/2025/dd/d4dd00358f?page=search","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143809090","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Quantitative evaluation of anharmonic bond potentials for molecular simulations†
IF 6.2
Digital discovery Pub Date : 2025-02-13 DOI: 10.1039/D4DD00344F
Paul J. van Maaren and David van der Spoel
{"title":"Quantitative evaluation of anharmonic bond potentials for molecular simulations†","authors":"Paul J. van Maaren and David van der Spoel","doi":"10.1039/D4DD00344F","DOIUrl":"https://doi.org/10.1039/D4DD00344F","url":null,"abstract":"<p >Most general force fields only implement a harmonic potential to model covalent bonds. In addition, in some force fields, all or a selection of the covalent bonds are constrained in molecular dynamics simulations. Nevertheless, it is possible to implement accurate bond potentials for a relatively small computational cost. Such potentials may be important for spectroscopic applications, free energy perturbation calculations or for studying reactions using empirical valence bond theory. Here, we evaluate different bond potentials for diatomic molecules. Based on quantum-chemical scans around the equilibrium distance of 71 molecules using the MP2/aug-cc-pVTZ level of theory as well as CCSD(T) with the same basis-set, we determine the quality of fit to the data of 28 model potentials. As expected, a large spread in accuracies of the potentials is found and more complex potentials generally provide a better fit. As a second and more challenging test, five spectroscopic parameters (<em>ω</em><small><sub>e</sub></small>, <em>ω</em><small><sub>e</sub></small><em>x</em><small><sub>e</sub></small>, <em>α</em><small><sub>e</sub></small>, <em>B</em><small><sub>e</sub></small> and <em>D</em><small><sub>e</sub></small>) predicted based on quantum chemistry as well as the fitted potentials are compared to experimental data. A handful of the 28 potentials tested are found to be accurate. Of these, we suggest that the potential due to Hua (<em>Phys. Rev. A</em>, <strong>42</strong> (1990), 2524) could be a suitable choice for implementation in molecular simulations codes, since it is considerably more accurate than the well-known Morse potential (<em>Phys. 
Rev.</em>, <strong>34</strong> (1929), 57) at a very similar computational cost.</p>","PeriodicalId":72816,"journal":{"name":"Digital discovery","volume":" 3","pages":" 824-830"},"PeriodicalIF":6.2,"publicationDate":"2025-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://pubs.rsc.org/en/content/articlepdf/2025/dd/d4dd00344f?page=search","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143602066","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Correction: Distortion/interaction analysis via machine learning
IF 6.2
Digital discovery Pub Date : 2025-02-06 DOI: 10.1039/D5DD90005K
Samuel G. Espley, Samuel S. Allsop, David Buttar, Simone Tomasi and Matthew N. Grayson
{"title":"Correction: Distortion/interaction analysis via machine learning","authors":"Samuel G. Espley, Samuel S. Allsop, David Buttar, Simone Tomasi and Matthew N. Grayson","doi":"10.1039/D5DD90005K","DOIUrl":"https://doi.org/10.1039/D5DD90005K","url":null,"abstract":"<p >Correction for ‘Distortion/interaction analysis <em>via</em> machine learning’ by Samuel G. Espley <em>et al.</em>, <em>Digital Discovery</em>, 2024, <strong>3</strong>, 2479–2486, https://doi.org/10.1039/D4DD00224E.</p>","PeriodicalId":72816,"journal":{"name":"Digital discovery","volume":" 3","pages":" 879-879"},"PeriodicalIF":6.2,"publicationDate":"2025-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://pubs.rsc.org/en/content/articlepdf/2025/dd/d5dd90005k?page=search","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143602071","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Active learning-guided exploration of thermally conductive polymers under strain†
IF 6.2
Digital discovery Pub Date : 2025-02-06 DOI: 10.1039/D4DD00267A
Renzheng Zhang, Jiaxin Xu, Hanfeng Zhang, Guoyue Xu and Tengfei Luo
{"title":"Active learning-guided exploration of thermally conductive polymers under strain†","authors":"Renzheng Zhang, Jiaxin Xu, Hanfeng Zhang, Guoyue Xu and Tengfei Luo","doi":"10.1039/D4DD00267A","DOIUrl":"https://doi.org/10.1039/D4DD00267A","url":null,"abstract":"<p >Finding amorphous polymers with higher thermal conductivity (TC) is technologically important, as they are ubiquitous in applications where heat transfer is crucial. While TC is generally low in amorphous polymers, it can be enhanced by mechanical strain, which facilitates the alignment of polymer chains. However, using the conventional Edisonian approach, the discovery of polymers that may have high TC after strain can be time-consuming, with no guarantee of success. In this work, we employ an active learning scheme to speed up the discovery of amorphous polymers with high TC under strain. Polymers under 2× strain are simulated using molecular dynamics (MD), and their TCs are calculated using non-equilibrium MD. A Gaussian process regression (GPR) model is then built using these MD data as the training set. The GPR model is used to screen the PoLyInfo database, and the predicted mean TC and uncertainty are used towards an acquisition function to recommend new polymers for labeling <em>via</em> Bayesian optimization. The TCs of these selected polymers are then labeled using MD simulations, and the obtained data are incorporated to rebuild the GPR model, initiating a new iteration of the active learning cycle. 
Over a few cycles, we identified ten strained polymers with significantly higher TC (&gt;1 W mK<small><sup>−1</sup></small>) than the original dataset, and the results offer valuable insights into the structural characteristics favorable for achieving high TC of polymers subject to strain.</p>","PeriodicalId":72816,"journal":{"name":"Digital discovery","volume":" 3","pages":" 812-823"},"PeriodicalIF":6.2,"publicationDate":"2025-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://pubs.rsc.org/en/content/articlepdf/2025/dd/d4dd00267a?page=search","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143602065","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0