Fluor-Predictor: An Interpretable Tool for Multiproperty Prediction and Retrieval of Fluorescent Dyes

IF 5.3 · CAS Zone 2 (Chemistry) · JCR Q1 (Chemistry, Medicinal)
Wenxiang Song, Le Xiong, Xinmin Li, Yuyang Zhang, Binya Wang, Guixia Liu, Weihua Li, Youjun Yang* and Yun Tang*
Journal of Chemical Information and Modeling, vol. 65, no. 6, pp. 2854–2867. Published 2025-03-12.
DOI: 10.1021/acs.jcim.5c00127
https://pubs.acs.org/doi/10.1021/acs.jcim.5c00127
Citations: 0

Abstract

With rapid advances in the field of fluorescent dyes, accurate prediction of optical properties and efficient retrieval of dye-related data have become essential for effective dye design. However, tools for comprehensive data integration and convenient data retrieval are lacking. Moreover, existing prediction models mainly focus on a single property of fluorescent dyes and fail to account systematically for the diversity of fluorophores and solutions. To address this, we propose Fluor-predictor, a multitask prediction model for fluorophores. This study integrates multiple dye databases and develops an interpretable, graph neural network-based multitask regression model to predict four key optical properties of fluorescent dyes. We thoroughly examined the impact of factors such as data quality and the number of solvents on model performance. By leveraging atom-level contribution weights, the model not only predicts these properties but also provides insights to guide structural modifications. In addition, we compiled and built a comprehensive database containing 36,756 records of fluorescence properties. To address the limitations of existing models in accurately predicting xanthene and cyanine dyes, we then compiled 1148 xanthene dye records and 1496 cyanine dye records from the literature and compared direct training with transfer learning. The model achieved mean absolute errors (MAE) of 11.70 nm, 15.37 nm, 0.096, and 0.091 for predicting absorption wavelength (λabs), emission wavelength (λem), quantum yield (Φ), and molar extinction coefficient (log(ε)), respectively. We integrated this work into a tool, Fluor-predictor, which supports comprehensive retrieval methods and multiproperty prediction. Fluor-predictor will facilitate data retrieval, prescreening, and structural modification of dyes.
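
The abstract reports one MAE per property. As a minimal sketch of how such per-property errors can be computed for a multitask regressor, the example below averages absolute errors column by column and skips records where a property was not measured. The NaN-for-missing convention, the `PROPERTIES` names, the `per_property_mae` helper, and the toy numbers are all assumptions made for illustration; none of them come from the paper or from the Fluor-predictor code.

```python
import math

# Property names follow the abstract; the data layout is an assumption.
PROPERTIES = ("lambda_abs_nm", "lambda_em_nm", "quantum_yield", "log_epsilon")


def per_property_mae(y_true, y_pred):
    """Return the mean absolute error for each property column.

    y_true and y_pred are lists of rows with one value per property.
    A NaN in y_true marks a property that was not measured for that
    record; such entries are skipped (hypothetical convention).
    """
    totals = [0.0] * len(PROPERTIES)
    counts = [0] * len(PROPERTIES)
    for true_row, pred_row in zip(y_true, y_pred):
        for j, (t, p) in enumerate(zip(true_row, pred_row)):
            if math.isnan(t):
                continue  # no label for this property in this record
            totals[j] += abs(t - p)
            counts[j] += 1
    # Report NaN for a property that has no labelled records at all.
    return {
        name: (totals[j] / counts[j] if counts[j] else math.nan)
        for j, name in enumerate(PROPERTIES)
    }


# Toy, made-up values purely for illustration (not data from the paper).
y_true = [
    [500.0, 520.0, 0.50, math.nan],
    [650.0, math.nan, 0.10, 5.0],
]
y_pred = [
    [510.0, 515.0, 0.45, 4.0],
    [640.0, 660.0, 0.20, 4.8],
]
mae = per_property_mae(y_true, y_pred)
```

Reporting the error per property rather than as one pooled number matters here because the four targets have different units and scales (nanometres for the wavelengths, dimensionless values for Φ and log(ε)).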

Source journal
CiteScore: 9.80
Self-citation rate: 10.70%
Articles published: 529
Review time: 1.4 months
Journal description: The Journal of Chemical Information and Modeling publishes papers reporting new methodology and/or important applications in the fields of chemical informatics and molecular modeling. Specific topics include the representation and computer-based searching of chemical databases, molecular modeling, computer-aided molecular design of new materials, catalysts, or ligands, development of new computational methods or efficient algorithms for chemical software, and biopharmaceutical chemistry including analyses of biological activity and other issues related to drug discovery. Astute chemists, computer scientists, and information specialists look to this monthly’s insightful research studies, programming innovations, and software reviews to keep current with advances in this integral, multidisciplinary field. As a subscriber you’ll stay abreast of database search systems, use of graph theory in chemical problems, substructure search systems, pattern recognition and clustering, analysis of chemical and physical data, molecular modeling, graphics and natural language interfaces, bibliometric and citation analysis, and synthesis design and reactions databases.