Digital discovery: latest articles, page 3

Advancing vanadium redox flow battery analysis: a deep learning approach for high-throughput 3D visualization and bubble quantification

IF 6.2

Digital discovery Pub Date : 2025-08-19 DOI: 10.1039/D5DD00158G

André Colliard-Granero, Kangjun Duan, Roswitha Zeis, Michael H. Eikerling, Kourosh Malek and Mohammad J. Eslamibidgoli

Abstract: This work harnesses deep learning to expedite analyses of research data for vanadium redox flow batteries. Recent studies have highlighted the significance of analyzing bubbles within vanadium redox flow batteries. Direct imaging of these bubbles remained elusive until advances in cell design made them observable through synchrotron X-ray tomography. Yet the large number of slices per tomogram and the complexity of the features make bubble analysis challenging. To tackle this issue, we propose a deep learning-based framework that allows experimentalists to conduct high-throughput analyses based on synchrotron X-ray tomographic images of vanadium redox flow batteries. We conducted a benchmarking study on various U-Net configurations using a dataset that includes three complete volumes. These volumes represent different cell configurations and encompass 2294 annotated images. Through a multi-class semantic segmentation approach, we aimed to identify four classes: bubbles, electrolytes, membranes, and gaskets. The optimal model achieved a precision of 98%, a recall of 97%, and an F1-score of 97% on the validation set. Following segmentation, the framework facilitates rapid differentiation of electrodes, quantification of bubble volume, individual bubble shape analysis, generation of 2D bubble density maps, and calculation of membrane blockage. All results are readily accessible for interactive, on-site visualization within a 3D environment. The openly available software allows users to engage with the data in a comprehensive and intuitive manner. For access, please visit the following GitHub repository: https://github.com/andyco98/UTILE-Redox.

Digital discovery, volume 10, pages 2724-2736. PDF: https://pubs.rsc.org/en/content/articlepdf/2025/dd/d5dd00158g?page=search
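The post-segmentation quantities mentioned in the abstract (bubble volume, 2D bubble density maps) can be illustrated with a minimal NumPy sketch. This is not the UTILE-Redox implementation: the class id `BUBBLE`, the voxel size, and both function names are assumptions chosen for illustration, and the label volume is a toy array.

```python
import numpy as np

# Hypothetical class id; the label encoding actually used by the framework
# is not stated in the abstract.
BUBBLE = 1


def bubble_volume(labels: np.ndarray, voxel_size_um: float) -> float:
    """Total bubble volume (cubic micrometres) in a 3D label volume."""
    n_voxels = int(np.count_nonzero(labels == BUBBLE))
    return n_voxels * voxel_size_um ** 3


def bubble_density_map(labels: np.ndarray, axis: int = 0) -> np.ndarray:
    """2D map counting bubble voxels along the chosen projection axis."""
    return (labels == BUBBLE).sum(axis=axis)


# Toy segmented volume: 4 slices of 5 x 5 pixels with one 2 x 2 x 2 bubble.
labels = np.zeros((4, 5, 5), dtype=np.uint8)
labels[1:3, 2:4, 2:4] = BUBBLE

volume_um3 = bubble_volume(labels, voxel_size_um=2.0)  # 8 voxels * 8 um^3
density = bubble_density_map(labels, axis=0)           # shape (5, 5)
```

Projecting along a different axis only requires changing `axis`; a membrane-blockage estimate would additionally need the membrane class and a definition of contact, which the abstract does not specify.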

Citations: 0

Automating care by self-maintainability for full laboratory automation

IF 6.2

Digital discovery Pub Date : 2025-08-19 DOI: 10.1039/D5DD00151J

Koji Ochiai, Yuya Tahara-Arai, Akari Kato, Kazunari Kaizu, Hirokazu Kariyazaki, Makoto Umeno, Koichi Takahashi, Genki N. Kanda and Haruka Ozaki

Abstract: The automation of experiments in life sciences and chemistry has significantly advanced with the development of various instruments and artificial intelligence (AI) technologies. However, achieving full laboratory automation, where experiments conceived by scientists are seamlessly executed in automated laboratories, remains a challenge. We identify the lack of automation in planning and operational tasks (critical human-managed processes collectively termed “care”) as a major barrier. Automating care is the key enabler for full laboratory automation. To address this, we propose the concept of self-maintainability (SeM): the ability of a laboratory system to autonomously adapt to internal and external disturbances, maintaining operational readiness. This ability is inspired by the homeostasis, resilience, autonomous state recognition, and adaptability seen in living cells. A SeM-enabled laboratory features autonomous recognition of its state, dynamic resource and information management, and adaptive responses to unexpected conditions. This shifts the planning and execution of experimental workflows, including scheduling and reagent allocation, from humans to the system. We present a conceptual framework for implementing SeM-enabled laboratories, comprising three modules (Requirement manager, Labware manager, and Device manager) and a Central manager. A laboratory design that is aware of SeM not only enables scientists to execute envisioned experiments seamlessly but also provides developers with a design concept that drives the technological innovations needed for full automation.

Digital discovery, volume 9, pages 2285-2297. PDF: https://pubs.rsc.org/en/content/articlepdf/2025/dd/d5dd00151j?page=search

Citations: 0

Correction: Atomate2: modular workflows for materials science

IF 6.2

Digital discovery Pub Date : 2025-08-18 DOI: 10.1039/D5DD90036K

Alex M. Ganose, Hrushikesh Sahasrabuddhe, Mark Asta, Kevin Beck, Tathagata Biswas, Alexander Bonkowski, Joana Bustamante, Xin Chen, Yuan Chiang, Daryl C. Chrzan, Jacob Clary, Orion A. Cohen, Christina Ertural, Max C. Gallant, Janine George, Sophie Gerits, Rhys E. A. Goodall, Rishabh D. Guha, Geoffroy Hautier, Matthew Horton, T. J. Inizan, Aaron D. Kaplan, Ryan S. Kingsbury, Matthew C. Kuner, Bryant Li, Xavier Linn, Matthew J. McDermott, Rohith Srinivaas Mohanakrishnan, Aakash A. Naik, Jeffrey B. Neaton, Shehan M. Parmar, Kristin A. Persson, Guido Petretto, Thomas A. R. Purcell, Francesco Ricci, Benjamin Rich, Janosh Riebesell, Gian-Marco Rignanese, Andrew S. Rosen, Matthias Scheffler, Jonathan Schmidt, Jimmy-Xuan Shen, Andrei Sobolev, Ravishankar Sundararaman, Cooper Tezak, Victor Trinquet, Joel B. Varley, Derek Vigil-Fowler, Duo Wang, David Waroquiers, Mingjian Wen, Han Yang, Hui Zheng, Jiongzhi Zheng, Zhuoying Zhu and Anubhav Jain

Citations: 0

GP-MoLFormer: a foundation model for molecular generation

IF 6.2

Digital discovery Pub Date : 2025-08-18 DOI: 10.1039/D5DD00122F

Jerret Ross, Brian Belgodere, Samuel C. Hoffman, Vijil Chenthamarakshan, Jiri Navratil, Youssef Mroueh and Payel Das

Abstract: Transformer-based models trained on large and general purpose datasets consisting of molecular strings have recently emerged as a powerful tool for successfully modeling various structure–property relations. Inspired by this success, we extend the paradigm of training chemical language transformers on large-scale chemical datasets to generative tasks in this work. Specifically, we propose GP-MoLFormer, an autoregressive molecular string generator that is trained on more than 1.1 billion chemical SMILES. GP-MoLFormer uses a 46.8M-parameter transformer decoder model with linear attention and rotary positional encodings as the base architecture. GP-MoLFormer's utility is evaluated and compared with that of existing baselines on three different tasks: de novo generation, scaffold-constrained molecular decoration, and unconstrained property-guided optimization. While the first two are handled with no additional training, we propose a parameter-efficient fine-tuning method for the last task, which uses property-ordered molecular pairs as input. We call this new approach pair-tuning. Our results show GP-MoLFormer performs better than or comparably to baselines across all three tasks, while producing molecules with higher diversity, demonstrating its general utility for a variety of molecular generation tasks. We further report strong memorization of training data in GP-MoLFormer generations, which has so far remained unexplored for chemical language models. Our analyses reveal that training data memorization and novelty in generations are impacted by the quality and scale of the training data; duplication bias in training data can enhance memorization at the cost of lowering novelty. We further establish a scaling law relating inference compute and novelty in generations, and show that the proposed model excels at yielding molecules containing unique scaffolds while generating at ≈10^6 to 10^9 scale.

Digital discovery, volume 10, pages 2684-2696. PDF: https://pubs.rsc.org/en/content/articlepdf/2025/dd/d5dd00122f?page=search
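Memorization and novelty, as discussed in the abstract, are usually tracked with set-based metrics over generated strings. The sketch below uses common textbook definitions (uniqueness as distinct/total, novelty as the share of distinct generations absent from the training set); the paper's exact definitions may differ, and the strings are assumed to be canonicalised beforehand.

```python
from typing import Iterable, Sequence


def uniqueness(generated: Sequence[str]) -> float:
    """Fraction of generated strings that are distinct."""
    return len(set(generated)) / len(generated)


def novelty(generated: Sequence[str], training: Iterable[str]) -> float:
    """Fraction of distinct generated strings not present in the training set."""
    distinct = set(generated)
    return len(distinct - set(training)) / len(distinct)


# Toy data (made up for illustration): one generation is a training duplicate.
generated = ["CCO", "CCO", "CCN", "c1ccccc1"]
training = {"CCO"}

u = uniqueness(generated)         # 3 distinct out of 4
n = novelty(generated, training)  # 2 of the 3 distinct strings are new
```

Under these definitions, duplicated training molecules that the model reproduces lower novelty without necessarily lowering uniqueness, which is one way to read the duplication-bias observation above.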

Citations: 0

Going beyond SMILES enumeration for data augmentation in generative drug discovery

IF 6.2

Digital discovery Pub Date : 2025-08-14 DOI: 10.1039/D5DD00028A

Helena Brinkmann, Antoine Argante, Hugo ter Steege and Francesca Grisoni

Abstract: Data augmentation can alleviate the limitations of small molecular datasets for generative deep learning by ‘artificially inflating’ the number of instances available for training. SMILES enumeration – wherein multiple valid SMILES strings are used to represent the same molecules – has become particularly beneficial to improve the quality of de novo molecule design. Herein, we investigated whether rethinking SMILES augmentation techniques could further enhance the quality of de novo design. To this end, we introduce four novel approaches for SMILES augmentation, drawing inspiration from natural language processing and chemistry insights: (a) token deletion, (b) atom masking, (c) bioisosteric substitution, and (d) self-training. Via systematic analysis, our results showed the promise of considering additional strategies for SMILES augmentation. Every strategy showed distinct advantages; for example, atom masking is particularly promising to learn desirable physico-chemical properties in very low-data regimes, and deletion to create novel scaffolds. This new repertoire of SMILES augmentation strategies expands the available toolkit to design molecules with bespoke properties in low-data scenarios.

Digital discovery, volume 10, pages 2752-2764. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12409607/pdf/
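Two of the augmentation strategies named in the abstract, token deletion and atom masking, can be sketched on tokenised SMILES strings. This is a minimal illustration, not the authors' code: the tokenizer regex, the `*` mask token, the per-token probability `p`, and all function names are assumptions, and the output is not checked for chemical validity.

```python
import random
import re

# Simplified SMILES tokenizer (assumption): bracket atoms, Br/Cl, two-digit
# ring closures, single-letter atoms, ring digits, and bond/branch symbols.
_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|[A-Za-z]|\d|[()=#\-+\\/.:~@*]")


def tokenize(smiles: str) -> list[str]:
    tokens = _TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"untokenizable characters in {smiles!r}")
    return tokens


def _is_atom(token: str) -> bool:
    return token[0] == "[" or token[0].isalpha()


def delete_tokens(smiles: str, p: float, rng: random.Random) -> str:
    """Drop each token independently with probability p."""
    return "".join(t for t in tokenize(smiles) if rng.random() >= p)


def mask_atoms(smiles: str, p: float, rng: random.Random, mask: str = "*") -> str:
    """Replace each atom token with a mask token with probability p."""
    return "".join(
        mask if _is_atom(t) and rng.random() < p else t for t in tokenize(smiles)
    )


rng = random.Random(0)
aspirin_fragment = "CC(=O)Oc1ccccc1"
augmented = [mask_atoms(aspirin_fragment, 0.2, rng) for _ in range(3)]
```

Deleted strings can be syntactically invalid SMILES (for example, an unclosed ring); whether such strings are kept or filtered is a design choice the abstract does not describe.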

Citations: 0

Development of synthetic chloride transporters using high-throughput screening and machine learning

IF 6.2

Digital discovery Pub Date : 2025-08-13 DOI: 10.1039/D5DD00140D

Surid Mohammad Chowdhury, Nada J. Daood, Katherine R. Lewis, Rayhanus Salam, Hao Zhu and Nathalie Busschaert

Abstract: The development of synthetic compounds capable of transporting chloride anions across biological membranes has become an intensive research field in the last two decades. Progress is driven by the desire to develop treatments for chloride-transport-related diseases (e.g., cystic fibrosis), cancer or bacterial infections. In this manuscript, we use high-throughput screening and machine learning to identify novel scaffolds, and to find the molecular features needed to achieve potent chloride transport that can be generalized across diverse chemotypes. 1894 compounds were tested, 59 of which had confirmed transmembrane chloride transport ability. A machine learning (ML) binary classification model indicated that MolLogP is the most important feature to predict transport ability, but it is not sufficient by itself. The best ML model was able to identify potential chloride transporters from the DrugBank database and the predictions were experimentally validated. These insights can provide other researchers with inspiration and guidelines to develop ever more potent chloride transporters.

Digital discovery, volume 9, pages 2615-2626. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12368578/pdf/

Citations: 0

Computer vision for polymer characterisation using lasers

IF 6.2

Digital discovery Pub Date : 2025-08-13 DOI: 10.1039/D5DD00219B

Seda Uyanik, Sam Parkinson, George Killick, Biplab Dutta, Rob Clowes, Charlotte E. Boott and Andrew I. Cooper

Abstract: Computer vision is a useful reaction monitoring and characterisation tool for scientists seeking to accelerate discovery processes using automation and machine learning (ML). Here we report a non-invasive laser-based method that combines computer vision and deep learning models to classify the solubility of different polymeric compounds across a range of solvents. Classifications were conducted using two to four solubility classes (soluble, soluble-colloidal, partially soluble, and insoluble), achieving high test accuracy rates ranging from 94.1% (2 classes) to 89.5% (4 classes). Using results from our solubility screening method, we also determined the Hansen Solubility Parameters (HSP) of the polymers using an optimisation algorithm. The calculated percentage Euclidean distance between the HSP values obtained from our dataset and the literature HSP values for the polymers ranged from 11% to 32%. Finally, we developed a feature-wise linear modulation (FiLM)-conditioned Convolutional Neural Network (CNN) regression model to estimate the size of polymeric nanoparticles between 20 and 440 nm and achieved a Mean Absolute Error (MAE) of 9.53 nm.

Digital discovery, volume 10, pages 2816-2826. PDF: https://pubs.rsc.org/en/content/articlepdf/2025/dd/d5dd00219b?page=search
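The "percentage Euclidean distance" between fitted and literature HSP values can be read in more than one way, and the abstract does not give the formula. The sketch below assumes one plausible reading: the Euclidean distance between the two (dispersion, polar, hydrogen-bonding) vectors, divided by the norm of the literature vector. Both the normalisation and the function name are assumptions, and the numbers are made up.

```python
import math
from typing import Sequence


def percent_euclidean_distance(
    fitted: Sequence[float], reference: Sequence[float]
) -> float:
    """Distance between two HSP vectors as a percentage of the reference norm.

    Assumed definition: 100 * ||fitted - reference|| / ||reference||.
    """
    distance = math.dist(fitted, reference)
    reference_norm = math.hypot(*reference)
    return 100.0 * distance / reference_norm


# Made-up vectors (dD, dP, dH), chosen only so the arithmetic is easy to follow.
reference = (3.0, 4.0, 0.0)  # norm 5
fitted = (3.0, 4.0, 1.0)     # differs by 1 in the last component
deviation = percent_euclidean_distance(fitted, reference)
```

Note that the standard Hansen distance Ra weights the dispersion term by a factor of 4; an unweighted Euclidean distance, as assumed here, treats the three components equally.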

Citations: 0

Comparative analysis of search approaches to discover donor molecules for organic solar cells

IF 6.2

Digital discovery Pub Date : 2025-08-13 DOI: 10.1039/D4DD00355A

Mohammed Azzouzi, Steven Bennett, Victor Posligua, Roberto Bondesan, Martijn A. Zwijnenburg and Kim E. Jelfs

Abstract: Identifying organic molecules with desirable properties from the extensive chemical space can be challenging, particularly when property evaluation methods are time-consuming and resource-intensive. In this study, we illustrate this challenge by exploring the chemical space of large oligomers, constructed from monomeric building blocks, for potential use in organic photovoltaics (OPV). For this purpose, we developed a Python package to search the chemical space using a building block approach: stk-search. We use stk-search (GitHub link: STK_search) to compare a variety of search algorithms, including those based upon Bayesian optimisation and evolutionary approaches. Initially, we evaluated and compared the performance of different search algorithms within a precomputed search space. We then extended our investigation to the vast chemical space of molecules formed of 6 building blocks (6-mers), comprising over 10^14 molecules. Notably, while some algorithms show only marginal improvements over a random search approach in a relatively small, precomputed, search space, their performance in the larger chemical space is orders of magnitude better. Specifically, Bayesian optimisation identified a thousand times more promising molecules with the desired properties compared to random search, using the same computational resources.

Digital discovery, volume 10, pages 2781-2796. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12379869/pdf/
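To give a feel for why a 6-mer space resists exhaustive evaluation, the following arithmetic sketch counts ordered sequences of building blocks. It reads the abstract's figure as 10^14 molecules and assumes, purely for illustration, that every position can hold any block and that order matters; the real library size and symmetry rules of the study are not stated in the abstract, and the function names are hypothetical.

```python
def ordered_sequences(n_blocks: int, length: int = 6) -> int:
    """Number of ordered sequences of `length` positions over `n_blocks` blocks."""
    return n_blocks ** length


def min_blocks_exceeding(target: int, length: int = 6) -> int:
    """Smallest library size whose ordered-sequence count exceeds `target`."""
    n = 1
    while ordered_sequences(n, length) <= target:
        n += 1
    return n


# Under the stated assumptions, a library of a few hundred building blocks
# is already enough to pass 10**14 six-position sequences.
n_needed = min_blocks_exceeding(10 ** 14)
space_size = ordered_sequences(n_needed)
```

At this scale even a fast property evaluation cannot be run on every candidate, which is the setting in which guided search methods are compared against random sampling.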

Citations: 0

Open-source generation of sigma profiles: impact of quantum chemistry and solvation treatment on machine learning performance

IF 6.2

Digital discovery Pub Date : 2025-08-12 DOI: 10.1039/D5DD00087D

Fathya Y. M. Salih, Dinis O. Abranches, Edward J. Maginn and Yamil J. Colón

Abstract: The combination of machine learning (ML) models with chemistry-related tasks requires the description of molecular structures in a machine-readable way. The nature of these so-called molecular descriptors has a direct and major impact on the performance of ML models and remains an open problem in the field. Structural descriptors like SMILES strings or molecular graphs lack size-independence and can be memory intensive. Machine-learned descriptors can be of low dimensionality and constant size but lack physical significance and human interpretability. Sigma profiles, which are unnormalized histograms of the surface charge distributions of solvated molecules, combine physical significance with low dimensionality and size-independence, making them a suitable candidate for a universal molecular descriptor. However, their widespread adoption in ML applications requires open access to sigma profile generation, which is currently not available. This work details the development of OpenSPGen – an open-source tool for generating sigma profiles. Also presented are studies on the effect of different settings on the efficacy of the generated sigma profiles at predicting thermophysical material properties when used as inputs to a Gaussian process as a simple surrogate ML model. We find that a higher level of theory does not translate to more accurate results. We also provide further recommendations for sigma profile calculation and use in ML models.

Digital discovery, volume 10, pages 2711-2723. PDF: https://pubs.rsc.org/en/content/articlepdf/2025/dd/d5dd00087d?page=search
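Since the abstract describes a sigma profile as an unnormalized histogram of surface charge densities, the core binning step can be sketched with NumPy. This is not OpenSPGen's code: the function name, the default range of ±0.025 e/Å² with 51 bins, and the input arrays are assumptions for illustration, and real workflows typically average the segment charges before binning.

```python
import numpy as np


def sigma_profile(
    sigma: np.ndarray,
    area: np.ndarray,
    bins: int = 51,
    sigma_range: tuple[float, float] = (-0.025, 0.025),
) -> tuple[np.ndarray, np.ndarray]:
    """Area-weighted, unnormalized histogram of surface charge densities.

    sigma: charge density of each surface segment (e/A^2, assumed unit)
    area:  area of each surface segment (A^2, assumed unit)
    Returns bin centres and the total surface area falling in each bin.
    """
    edges = np.linspace(sigma_range[0], sigma_range[1], bins + 1)
    profile, _ = np.histogram(sigma, bins=edges, weights=area)
    centres = 0.5 * (edges[:-1] + edges[1:])
    return centres, profile


# Made-up surface segments, binned coarsely so the result is easy to inspect.
sigma = np.array([-0.010, 0.000, 0.000, 0.012])
area = np.array([1.0, 2.0, 0.5, 1.5])
centres, profile = sigma_profile(sigma, area, bins=5)
```

Because the output length depends only on the number of bins, molecules of any size map to a vector of the same length, which is the size-independence property the abstract highlights.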

Citations: 0

Combining DeepH with HONPAS for accurate and efficient hybrid functional electronic structure calculations with ten thousand atoms

IF 6.2

Digital discovery Pub Date : 2025-08-11 DOI: 10.1039/D5DD00128E

Yifan Ke, Xinming Qin, Wei Hu and Jinlong Yang

Citations: 0