Digital Discovery: latest articles

Predicting performance of object detection models in electron microscopy using random forests†
IF 6.2
Digital discovery Pub Date : 2025-03-04 DOI: 10.1039/D4DD00351A
Ni Li, Ryan Jacobs, Matthew Lynch, Vidit Agrawal, Kevin Field and Dane Morgan
Abstract: Quantifying prediction uncertainty when applying object detection models to new, unlabeled datasets is critical in applied machine learning. This study introduces an approach to estimate the performance of deep learning-based object detection models for quantifying defects in transmission electron microscopy (TEM) images, focusing on detecting irradiation-induced cavities in TEM images of metal alloys. We developed a random forest regression model that predicts the object detection F₁ score, a statistical metric used to evaluate the ability to accurately locate and classify objects of interest. The random forest model uses features extracted from the predictions of the object detection model whose uncertainty is being quantified, enabling fast prediction on new, unlabeled images. The mean absolute error (MAE) for predicting F₁ of the trained model on test data is 0.09, and the R² score is 0.77, indicating there is a significant correlation between the random forest regression model predicted and true defect detection F₁ scores. The approach is shown to be robust across three distinct TEM image datasets with varying imaging and material domains. Our approach enables users to estimate the reliability of a defect detection and segmentation model predictions and assess the applicability of the model to their specific datasets, providing valuable information about possible domain shifts and whether the model needs to be fine-tuned or trained on additional data to be maximally effective for the desired use case.
Volume 4, pages 987-997. PDF: https://pubs.rsc.org/en/content/articlepdf/2025/dd/d4dd00351a?page=search
Citations: 0
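The abstract above names three standard metrics: the F₁ score of the object detector, and the MAE and R² of the regression model that predicts it. The following is a minimal sketch of their textbook definitions in plain Python. The function names and all sample numbers are illustrative assumptions and are not taken from the paper or its code.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 is the harmonic mean of precision and recall, computed from detection counts."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical per-image values: the detector's true F1 and the F1 a regressor predicted.
true_f1 = [0.9, 0.7, 0.5, 0.8]
pred_f1 = [0.85, 0.75, 0.55, 0.7]

print(f1_score(tp=8, fp=2, fn=2))
print(mean_absolute_error(true_f1, pred_f1))
print(r2_score(true_f1, pred_f1))
```

In this reading, an MAE of 0.09 means the predicted F₁ is off by 0.09 on average, on a scale that runs from 0 to 1.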
ADEL: an automated drop-cast electrode setup for high-throughput screening of battery materials†
IF 6.2
Digital discovery Pub Date : 2025-02-28 DOI: 10.1039/D4DD00381K
Maha Ismail, Maria Angeles Cabañero, Joseba Orive, Lakshmipriya Musuvadhi Babulal, Javier Garcia, Maria C. Morant-Miñana, Jean-Luc Dauvergne, Francisco Bonilla, Iciar Monterrubio, Javier Carrasco, Amaia Saracibar and Marine Reynaud
Abstract: Screening electrode materials in conventional battery research is time-consuming due to the lengthy and intricate preparation process, where multiple parameters directly influence electrochemical performance. In this work, we present ADEL, an affordable module for the Automated preparation of high-loading Drop-cast ELectrodes, integrated within MAITENA, a Materials Acceleration and Innovation plaTform for ENergy Applications. The process consists of two main steps: (i) the automated preparation of electrode slurries and (ii) the drop-casting of these slurries onto aluminum foils using a pipetting robot, followed by drying under a halogen lamp. ADEL enables the preparation of 48 electrodes per day, allowing for the screening of up to 24 distinct active materials and/or electrode formulations. We demonstrate the method's repeatability using various commercial and lab-synthesized battery materials in different cell configurations, consistently achieving results with less than 3% relative standard deviation. As such, ADEL provides reliable, high-quality datasets for fast screening of battery materials, significantly accelerating research and development efforts.
Volume 4, pages 943-953. PDF: https://pubs.rsc.org/en/content/articlepdf/2025/dd/d4dd00381k?page=search
Citations: 0
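The repeatability figure in the abstract above is a relative standard deviation (RSD). As a rough sketch of how such a figure is usually computed, the snippet below divides the sample standard deviation by the mean. The capacity values are made-up placeholders, not data from the paper, and the choice of the sample (n - 1) standard deviation is an assumption.

```python
from statistics import mean, stdev


def relative_std_dev(values):
    """Relative standard deviation in percent: sample standard deviation / mean * 100."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical specific capacities (mAh/g) measured on replicate electrodes.
capacities = [150.0, 152.0, 148.0, 151.0, 149.0]

rsd = relative_std_dev(capacities)
print(f"RSD = {rsd:.2f}%")
```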
Atlas: a brain for self-driving laboratories
IF 6.2
Digital discovery Pub Date : 2025-02-26 DOI: 10.1039/D4DD00115J
Riley J. Hickman, Malcolm Sim, Sergio Pablo-García, Gary Tom, Ivan Woolhouse, Han Hao, Zeqing Bao, Pauric Bannigan, Christine Allen, Matteo Aldeghi and Alán Aspuru-Guzik
Abstract: Self-driving laboratories (SDLs) are next-generation research and development platforms for closed-loop, autonomous experimentation that combine ideas from artificial intelligence, robotics, and high-performance computing. A critical component of SDLs is the decision-making algorithm used to prioritize experiments to be performed. This SDL “brain” often relies on optimization strategies that are guided by machine learning models, such as Bayesian optimization. However, the diversity of hardware constraints and scientific questions being tackled by SDLs require the availability of a set of flexible algorithms that have yet to be implemented in a single software tool. Here, we report Atlas, an application-agnostic Python library for Bayesian optimization that is specifically tailored to the needs of SDLs. Atlas provides facile access to state-of-the-art, model-based optimization algorithms (including mixed-parameter, multi-objective, constrained, robust, multi-fidelity, meta-learning, asynchronous, and molecular optimization) as an all-in-one tool that is expected to suit the majority of specialized SDL needs. After a brief description of its core capabilities, we demonstrate Atlas' utility by optimizing the oxidation potential of metal complexes with an autonomous electrochemical experimentation platform. We expect Atlas to expand the breadth of design and discovery problems in the natural sciences that are immediately addressable with SDLs.
Volume 4, pages 1006-1029. PDF: https://pubs.rsc.org/en/content/articlepdf/2025/dd/d4dd00115j?page=search
Citations: 0
High-throughput robotic collection, imaging, and machine learning analysis of salt patterns: composition and concentration from dried droplet photos†
IF 6.2
Digital discovery Pub Date : 2025-02-26 DOI: 10.1039/D4DD00333K
Bruno C. Batista, Amrutha S. V., Jie Yan, Beni B. Dangi and Oliver Steinbock
Abstract: Macroscopic deposit patterns resulting from dried solutions and dispersions are often perceived as random and without meaningful information. Their formation is governed by a bewildering interplay of evaporation, crystal nucleation and growth, capillary flows, Marangoni convection, diffusion, and heat exchange that severely hinders mechanistic studies. It is therefore remarkable that the patterns contain subtle clues about the chemical nature of the original solution. To utilize this information, extensive reference image libraries and advanced analysis methods are essential. For this purpose, we developed a robotic drop imager (RODI) that, under non-stop operation, produces up to 2500 high-resolution images of sample deposits daily. Utilizing RODI, we have assembled an initial library of 23 417 images for seven inorganic salts and five concentration levels. Each image is analyzed and distilled into 47 metric values that capture distinct characteristics of the deposit patterns. This compact dataset is utilized for machine learning and artificial intelligence training, specifically with Random Forest, XGBoost, and a deep learning multi-layer perceptron. We achieved prediction accuracies of 98.7% for the salt type and 92.2% for the combined salt type and initial concentration. Expanded databases will likely enable the rapid identification of broad compositional features from mere photographic images, with possible applications ranging from phone-based apps to field-based analytical and lab safety tools.
Volume 4, pages 1030-1041. PDF: https://pubs.rsc.org/en/content/articlepdf/2025/dd/d4dd00333k?page=search
Citations: 0
Unveiling CO2 reactivity with data-driven methods†
IF 6.2
Digital discovery Pub Date : 2025-02-26 DOI: 10.1039/D5DD00020C
Maike Eckhoff, Kerstin L. Bublitz and Jonny Proppe
Abstract: Carbon dioxide is a versatile C1 building block in organic synthesis. Understanding its reactivity is crucial for predicting reaction outcomes and identifying suitable substrates for the creation of value-added chemicals and drugs. A recent study [Li et al., J. Am. Chem. Soc., 2020, 142, 8383] estimated the reactivity of CO₂ in the form of Mayr's electrophilicity parameter E on the basis of a single carboxylation reaction. The disagreement between experiment (E = −16.3) and computation (E = −11.4) corresponds to a deviation of up to ten orders of magnitude in bimolecular rate constants of carboxylation reactions according to the Mayr–Patz equation, log k = s_N(E + N). Here, we introduce a data-driven approach incorporating supervised learning, quantum chemistry, and uncertainty quantification to resolve this discrepancy. The dataset used for reducing the uncertainty in E(CO₂) represents 15 carboxylation reactions in DMSO. However, experimental data is only available for one of these reactions. To ensure reliable predictions, we selected a training set composed of this and 19 additional reactions comprising heteroallenes other than CO₂ for which experimental data is available. With the new data-driven protocol, we can narrow down the electrophilicity of carbon dioxide to E(CO₂) = −14.6(5) with 95% confidence, and suggest an electrophile-specific sensitivity parameter s_E(CO₂) = 0.81(6), resulting in an extended reactivity equation, log k = s_E s_N(E + N) [Mayr, Tetrahedron, 2015, 71, 5095].
Volume 3, pages 868-878. PDF: https://pubs.rsc.org/en/content/articlepdf/2025/dd/d5dd00020c?page=search
Citations: 0
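The two reactivity equations quoted in the abstract above are simple enough to evaluate directly. The sketch below does so for the E values the abstract reports. The nucleophile parameters N and s_N are hypothetical placeholders chosen only for illustration, and the function name is an assumption.

```python
def log_k(E: float, N: float, s_N: float, s_E: float = 1.0) -> float:
    """log10 of the rate constant: log k = s_E * s_N * (E + N).

    With s_E = 1 this reduces to the Mayr-Patz form, log k = s_N * (E + N).
    """
    return s_E * s_N * (E + N)


# Values quoted in the abstract.
E_EXPERIMENT = -16.3
E_COMPUTATION = -11.4
E_DATA_DRIVEN = -14.6
S_E_CO2 = 0.81

# Hypothetical nucleophile parameters, not taken from the paper.
N, s_N = 20.0, 1.0

# Gap in log k between the two earlier E estimates under the Mayr-Patz form.
gap = log_k(E_COMPUTATION, N, s_N) - log_k(E_EXPERIMENT, N, s_N)
print(f"gap in log k: {gap:.2f}")

# Prediction from the extended equation with the data-driven E and s_E.
extended = log_k(E_DATA_DRIVEN, N, s_N, s_E=S_E_CO2)
print(f"extended-equation log k: {extended:.3f}")
```

Because N cancels in the difference, the gap equals s_N × 4.9 for any nucleophile, so it grows linearly with the nucleophile's sensitivity parameter.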
Automating stochastic antibody–drug conjugation: a self-driving lab approach for enhanced therapeutic development†
IF 6.2
Digital discovery Pub Date : 2025-02-24 DOI: 10.1039/D4DD00363B
Liam Roberts, Matthew E. Reish, Jerrica Yang, Wenyu Zhang, Joshua S. Derasp and Jason E. Hein
Abstract: Antibody–drug conjugates (ADCs) have become a promising cancer treatment over the past two decades due to their on-target drug-release capabilities. However, labor-intensive manual conjugations currently limit the throughput of ADC synthesis. Herein, we introduce a Self-Driving Lab (SDL) for automated stochastic antibody–drug conjugation and characterization. The robotic platform performs conjugations and determines drug to antibody ratios from chromatography data, enabling the production of target ADCs iteratively in a closed loop. Our SDL establishes a robust foundation for increasing ADC production throughput and accelerating the development of cancer therapeutics.
Volume 4, pages 979-986. PDF: https://pubs.rsc.org/en/content/articlepdf/2025/dd/d4dd00363b?page=search
Citations: 0
Auto-generating question-answering datasets with domain-specific knowledge for language models in scientific tasks†
IF 6.2
Digital discovery Pub Date : 2025-02-24 DOI: 10.1039/D4DD00307A
Zongqian Li and Jacqueline M. Cole
Abstract: Large language models (LLMs) have emerged as a useful tool for the public to process and respond to a vast range of interactive text-based queries. While foundational LLMs are well suited to making general user queries, smaller language models that have been trained on custom text from a specific domain of interest tend to display superior performance on queries about that domain, can operate faster and improve efficiency. Nonetheless, considerable resources are still needed to pre-train a language model with custom data. We present a pipeline that shows a way to overcome this need for pre-training. The pipeline first uses new algorithms that we have designed to produce a large, high-quality question-answering dataset (SCQA) for a particular domain of interest, solar cells. These algorithms employed a solar-cell database that had been auto-generated using the ‘chemistry-aware’ natural language processing tool, ChemDataExtractor. In turn, this SCQA dataset is used to fine-tune language models, whose resulting F₁-scores of performance far exceed (by 10–20%) those of analogous language models that have been fine-tuned against a general-English language QA dataset, SQuAD. Importantly, the performance of the language models fine-tuned against the SCQA dataset does not depend on the size of their architecture, whether or not the tokens were cased or uncased or whether or not the foundational language models were further pre-trained with domain-specific data or fine-tuned directly from their vanilla state. This shows that this domain-specific SCQA dataset produced by our algorithms has sufficient intrinsic domain knowledge to be directly fine-tuned against a foundational language model for immediate use with improved performance.
Volume 4, pages 998-1005. PDF: https://pubs.rsc.org/en/content/articlepdf/2025/dd/d4dd00307a?page=search
Citations: 0
SANE: strategic autonomous non-smooth exploration for multiple optima discovery in multi-modal and non-differentiable black-box functions†
IF 6.2
Digital discovery Pub Date : 2025-02-18 DOI: 10.1039/D4DD00299G
Arpan Biswas, Rama Vasudevan, Rohit Pant, Ichiro Takeuchi, Hiroshi Funakubo and Yongtao Liu
Abstract: Both computational and experimental material discovery bring forth the challenge of exploring multidimensional and multimodal parameter spaces, such as phase diagrams of Hamiltonians with multiple interactions, composition spaces of combinatorial libraries, material structure image spaces, and molecular embedding spaces. Often these systems are black-boxes and time-consuming to evaluate, which resulted in strong interest towards active learning methods such as Bayesian optimization (BO). However, these systems are often noisy which make the black box function severely multi-modal and non-differentiable, where a vanilla BO can get overly focused near a single or faux optimum, deviating from the broader goal of scientific discovery. To address these limitations, here we developed Strategic Autonomous Non-Smooth Exploration (SANE) to facilitate an intelligent Bayesian optimized navigation with a proposed cost-driven probabilistic acquisition function to find multiple global and local optimal regions, avoiding the tendency to becoming trapped in a single optimum. To distinguish between a true and false optimal region due to noisy experimental measurements, a human (domain) knowledge driven dynamic surrogate gate is integrated with SANE. We implemented the gate-SANE into pre-acquired piezoresponse spectroscopy data of a ferroelectric combinatorial library with high noise levels in specific regions, and piezoresponse force microscopy (PFM) hyperspectral data. SANE demonstrated better performance than classical BO to facilitate the exploration of multiple optimal regions and thereby prioritized learning with higher coverage of scientific values in autonomous experiments. Our work showcases the potential application of this method to real-world experiments, where such combined strategic and human intervening approaches can be critical to unlocking new discoveries in autonomous research.
Volume 3, pages 853-867. PDF: https://pubs.rsc.org/en/content/articlepdf/2025/dd/d4dd00299g?page=search
Citations: 0
Dissecting errors in machine learning for retrosynthesis: a granular metric framework and a transformer-based model for more informative predictions
IF 6.2
Digital discovery Pub Date : 2025-02-18 DOI: 10.1039/D4DD00263F
Arihanth Srikar Tadanki, H. Surya Prakash Rao and U. Deva Priyakumar
Abstract: Chemical reaction prediction, encompassing forward synthesis and retrosynthesis, stands as a fundamental challenge in organic synthesis. A widely adopted computational approach frames synthesis prediction as a sequence-to-sequence translation task, using the commonly used SMILES representation for molecules. The current evaluation of machine learning methods for retrosynthesis assumes perfect training data, overlooking imperfections in reaction equations in popular datasets, such as missing reactants, products, other physical and practical constraints such as temperature and cost, primarily due to a focus on the target molecule. This limitation leads to an incomplete representation of viable synthetic routes, especially when multiple sets of reactants can yield a given desired product. In response to these shortcomings, this study examines the prevailing evaluation methods and introduces comprehensive metrics designed to address imperfections in the dataset. Our novel metrics not only assess absolute accuracy by comparing predicted outputs with ground truth but also introduce a nuanced evaluation approach. We provide scores for partial correctness and compute adjusted accuracy through graph matching, acknowledging the inherent complexities of retrosynthetic pathways. Additionally, we explore the impact of small molecular augmentations while preserving chemical properties and employ similarity matching to enhance the assessment of prediction quality. We introduce SynFormer, a sequence-to-sequence model tailored for SMILES representation. It incorporates architectural enhancements to the original transformer, effectively tackling the challenges of chemical reaction prediction. SynFormer achieves a Top-1 accuracy of 53.2% on the USPTO-50k dataset, matching the performance of widely accepted models like Chemformer, but with greater efficiency by eliminating the need for pre-training.
Volume 3, pages 831-845. PDF: https://pubs.rsc.org/en/content/articlepdf/2025/dd/d4dd00263f?page=search
Citations: 0
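To make the distinction between exact-match (Top-1) accuracy and a partial-correctness score in the abstract above concrete, here is a deliberately simplified sketch. It compares reactant sets as raw SMILES strings. This is an assumption made for illustration only: it is not the paper's metric, which the abstract says relies on graph matching, and a real evaluation would first canonicalize the SMILES with a cheminformatics toolkit. All function names and example reactions are hypothetical.

```python
def reactant_set(smiles: str) -> frozenset:
    """Split a dot-separated SMILES string into a set of molecule strings."""
    return frozenset(part for part in smiles.split(".") if part)


def top1_accuracy(predictions, targets) -> float:
    """Fraction of predictions whose reactant set equals the ground truth exactly."""
    hits = sum(reactant_set(p) == reactant_set(t) for p, t in zip(predictions, targets))
    return hits / len(targets)


def partial_score(prediction: str, target: str) -> float:
    """Fraction of ground-truth reactants that also appear in the prediction."""
    pred, true = reactant_set(prediction), reactant_set(target)
    return len(pred & true) / len(true)


# Hypothetical ground-truth and predicted reactant sets for two products.
targets = ["CC(=O)O.OCC", "c1ccccc1Br.OB(O)c1ccccc1"]
predictions = ["OCC.CC(=O)O", "c1ccccc1Br.CC"]

print(top1_accuracy(predictions, targets))
print([partial_score(p, t) for p, t in zip(predictions, targets)])
```

The second prediction counts as a complete miss under exact matching, yet it recovers one of the two ground-truth reactants, which is the kind of information a partial score keeps.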
Digital workflow optimization of van der Waals methods for improved halide perovskite solar materials†
IF 6.2
Digital discovery Pub Date : 2025-02-18 DOI: 10.1039/D4DD00312H
Celso R. C. Rêgo, Wolfgang Wenzel, Maurício J. Piotrowski, Alexandre C. Dias, Carlos Maciel de Oliveira Bastos, Luis O. de Araujo and Diego Guedes-Sobrinho
Abstract: Hybrid organic–inorganic metal halide perovskites are low-cost and highly efficient materials used in solar cell devices. However, the intricacies of perovskites that merge organic cations with inorganic frameworks necessitate further elucidation, particularly from the long-range van der Waals perspective. Here, we scrutinize the van der Waals (vdW) methods by conceptualizing organic cations for XH₄PbI₃ and CH₃XH₃PbI₃ prototype perovskites (X = N, P, As, and Sb), to investigate the thermodynamic stability. To handle the enormous amount of raw data generated from DFT + vdW + SOC with DFT-1/2 (quasi-particle correction method), we have used the SimStack workflow framework, which enhanced the efficiency, reproducibility, and data transferability. The results reveal the critical role of the organic cations, inferred from ionic radius estimates and documented electronegativity, in elucidating the accommodation of symmetric XH₄⁺ or asymmetric CH₃XH₃⁺ cations within the limited volumes of cuboctahedral cavities. The discrepancy in the ionic size within the XH₄PbI₃ (CH₃XH₃PbI₃) group positions NH₄PbI₃ (CH₃NH₃PbI₃) outside (within) the stable perovskite region suggests the theoretical viability of perovskites containing phosphonium, arsonium, and stibonium beyond CH₃NH₃PbI₃. As we move from N to Sb, the organic cation's properties, such as ionic radius and electronegativity, affect the thermodynamic stability and local geometry of octahedra, directly influencing the band gaps.
Volume 4, pages 927-942. PDF: https://pubs.rsc.org/en/content/articlepdf/2025/dd/d4dd00312h?page=search
Citations: 0