High-Throughput Photocatalysis for Generating Reliable Datasets Analyzed by Machine Learning.

Impact Factor 2.2 · CAS Zone 3 (Chemistry) · JCR Q3 (Chemistry, Physical)
Mark Croxall, Reece Lawrence, Jiaqi Gong, M Cynthia Goh
ChemPhysChem, e202500039. Journal Article. Published 2025-10-10. DOI: 10.1002/cphc.202500039 (https://doi.org/10.1002/cphc.202500039)
Citations: 0

Abstract

Photocatalysis is an environmentally conscious tool for removing contaminants from water. Novel photocatalytic materials are often evaluated on their ability to degrade a small number of analytes, which may not be indicative of broader applicability. In this work, an experimental method dubbed high-throughput photocatalysis (HTP) is introduced to assay photocatalytic materials against a range of analytes in a time-effective manner. HTP is modular; experimental parameters, including the matrix, can be changed to fit a proposed application. The photodegradation of each analyte is measured in a consistent manner such that machine learning (ML) models can be applied to the obtained datasets. Three out-of-the-box ML models, namely linear regression, random forest (RF), and a neural network (NN), are tasked with estimating the percentage removal as a function of irradiation time and molecular structure, as represented by Morgan fingerprints. Leave-out sets demonstrated that the RF and NN models did not overfit the training data and reasonably estimated the degradation of unknown molecules. SHapley Additive exPlanations (SHAP) values are utilized to correlate molecular substructures with the parent molecule's susceptibility to photocatalytic degradation. These correlations are used to generate heatmaps of estimated reactivity within molecules that corroborate reports in which dye degradation pathways were studied in detail.
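The feature layout and leave-out evaluation described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the molecule names, the removal values, and the 4-bit vectors standing in for Morgan fingerprints are all hypothetical placeholders, and holding out whole molecules is one plausible reading of "leave-out sets" for unknown molecules.

```python
# Hypothetical toy records: (molecule name, irradiation time in min, % removal).
# Values are illustrative placeholders, not data from the paper.
RECORDS = [
    ("dye_A", 0, 0.0), ("dye_A", 30, 40.0), ("dye_A", 60, 70.0),
    ("dye_B", 0, 0.0), ("dye_B", 30, 10.0), ("dye_B", 60, 25.0),
    ("dye_C", 0, 0.0), ("dye_C", 30, 55.0), ("dye_C", 60, 90.0),
]

# Hypothetical 4-bit stand-ins for Morgan fingerprints. Real fingerprints
# are much longer bit vectors computed by a cheminformatics toolkit.
FINGERPRINTS = {
    "dye_A": [1, 0, 1, 0],
    "dye_B": [0, 1, 1, 0],
    "dye_C": [1, 1, 0, 1],
}


def featurize(name, time_min):
    """Build a feature vector: irradiation time followed by fingerprint bits."""
    return [float(time_min)] + [float(bit) for bit in FINGERPRINTS[name]]


def leave_molecule_out(records, held_out):
    """Split rows so that every measurement of `held_out` is in the test set."""
    train, test = [], []
    for name, time_min, removal in records:
        row = (featurize(name, time_min), removal)
        if name == held_out:
            test.append(row)
        else:
            train.append(row)
    return train, test


train, test = leave_molecule_out(RECORDS, "dye_B")
```

Splitting by molecule rather than by row keeps every time point of the held-out molecule out of training, so a model scored on `test` is judged on a structure it has never seen. Any regressor (linear, random forest, neural network) could then be fit on `train` and evaluated on `test`.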

Source journal: ChemPhysChem (Chemistry, Physical: Atomic, Molecular and Chemical Physics)
CiteScore: 4.60
Self-citation rate: 3.40%
Articles published: 425
Review time: 1.1 months
About the journal: ChemPhysChem is one of the leading chemistry/physics interdisciplinary journals (ISI Impact Factor 2018: 3.077) for physical chemistry and chemical physics. It is published on behalf of Chemistry Europe, an association of 16 European chemical societies. ChemPhysChem is an international source for important primary and critical secondary information across the whole field of physical chemistry and chemical physics. It integrates this wide and flourishing field, spanning Solid State and Soft-Matter Research, Electro- and Photochemistry, Femtochemistry and Nanotechnology, Complex Systems, Single-Molecule Research, Clusters and Colloids, Catalysis and Surface Science, Biophysics and Physical Biochemistry, Atmospheric and Environmental Chemistry, and many more topics. ChemPhysChem is peer-reviewed.