Accurate property predictions and reliability quantification in molecular design based on molecular similarity

IF 3.9 | Zone 2 (Engineering and Technology) | JCR Q2 (Computer Science, Interdisciplinary Applications)
Youquan Xu, Zhijiang Shao, Anjan K. Tula
Journal: Computers & Chemical Engineering, vol. 201, Article 109241
Published: 2025-06-12
DOI: 10.1016/j.compchemeng.2025.109241
Open access: no
URL: https://www.sciencedirect.com/science/article/pii/S0098135425002455
Citations: 0

Accurate property predictions and reliability quantification in molecular design based on molecular similarity
A crucial step in developing high-performance chemical products is the design of their constituent molecules. Computer-aided molecular design (CAMD) has gained significant attention for its potential to accelerate and enhance this design process. The typical approach involves using machine learning models trained on existing molecular databases to predict the properties of potential molecules. From these predictions, the best candidates are selected. However, prediction errors can occur, leading to unreliability in the design and limiting the effectiveness of molecular discovery. To tackle this issue, this paper presents a novel framework for modeling molecular properties based on a similarity coefficient. This framework introduces a new formula for assessing molecular similarity. By calculating the similarity between a target molecule and those in an existing database, the framework selects the most similar molecules, creating a tailored dataset for model training. This significantly enhances the accuracy of property predictions. Additionally, a quantitative reliability index is proposed based on the similarity coefficient, which allows for more informed decision-making during the molecular selection process.
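
The workflow in the abstract (score the similarity between a target molecule and every database molecule, keep the most similar ones, predict from that tailored subset, and report a similarity-based reliability figure) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the paper's similarity formula or reliability index, so the sketch swaps in the standard Tanimoto coefficient on feature sets, uses a similarity-weighted average in place of a trained machine learning model, and takes the mean neighbour similarity as the reliability score. All names (`tanimoto`, `select_similar`, `predict_with_reliability`, `TOY_DATABASE`) and all property values are hypothetical placeholders, not data from the paper.

```python
from typing import FrozenSet, List, Optional, Tuple

# Hypothetical record layout: (name, structural feature set, property value).
Molecule = Tuple[str, FrozenSet[str], float]

# Placeholder database; feature labels and values are made up for illustration.
TOY_DATABASE: List[Molecule] = [
    ("mol_a", frozenset({"OH", "CH3", "CH2"}), 10.0),
    ("mol_b", frozenset({"OH", "CH3"}), 8.0),
    ("mol_c", frozenset({"COOH", "CH3"}), 20.0),
    ("mol_d", frozenset({"NH2", "aromatic"}), 50.0),
]


def tanimoto(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Tanimoto coefficient |A & B| / |A | B|; a stand-in for the paper's formula."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def select_similar(
    target: FrozenSet[str], database: List[Molecule], k: int
) -> List[Tuple[float, Molecule]]:
    """Return the k database molecules most similar to the target, with scores."""
    scored = [(tanimoto(target, mol[1]), mol) for mol in database]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def predict_with_reliability(
    target: FrozenSet[str], database: List[Molecule], k: int = 2
) -> Tuple[Optional[float], float]:
    """Predict a property from the tailored subset and score its reliability.

    Prediction: similarity-weighted average of neighbour values (assumption,
    used here instead of training a model on the subset).
    Reliability: mean similarity of the selected neighbours (assumption).
    """
    neighbours = select_similar(target, database, k)
    total = sum(score for score, _ in neighbours)
    if total == 0.0:
        # Nothing in the database resembles the target: no usable prediction.
        return None, 0.0
    prediction = sum(score * mol[2] for score, mol in neighbours) / total
    reliability = total / len(neighbours)
    return prediction, reliability


if __name__ == "__main__":
    target = frozenset({"OH", "CH3", "CH2"})
    value, reliability = predict_with_reliability(target, TOY_DATABASE, k=2)
    print(f"prediction={value:.3f} reliability={reliability:.3f}")
```

A low reliability score flags a candidate whose nearest database neighbours are structurally distant, which is the kind of signal the abstract says can inform candidate selection. In practice the feature sets would come from a cheminformatics toolkit and the weighted average would be replaced by a model trained on the selected subset.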
Source journal
Computers & Chemical Engineering (Engineering and Technology: Chemical Engineering)
CiteScore: 8.70
Self-citation rate: 14.00%
Articles published: 374
Review time: 70 days
About the journal: Computers & Chemical Engineering is primarily a journal of record for new developments in the application of computing and systems technology to chemical engineering problems.