Dual-target candidate compounds from a transformer chemical language model contain characteristic structural features

Sanjana Srinivasan, Alec Lamens, Jürgen Bajorath
European Journal of Medicinal Chemistry Reports, volume 15, Article 100291. Published 2025-07-30. Journal article.
DOI: 10.1016/j.ejmcr.2025.100291
https://www.sciencedirect.com/science/article/pii/S2772417425000470
Citations: 0

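Abstract

Chemical language models (CLMs) are increasingly used for generative design of candidate compounds in medicinal chemistry. However, their predictions are difficult to rationalize. Currently, detailed computational explanations of CLM-based compound generation are unavailable. Therefore, we have attempted to better understand, from a medicinal chemistry perspective, how CLMs learn and arrive at compound predictions. To this end, we have subjected dual-target candidate compounds for polypharmacology generated with transformer CLMs to a series of analysis steps that explore the structural features the models learn, and compared these candidates to known compounds with dual-target activity. Using machine learning combined with distinct chemical structure-oriented approaches from explainable artificial intelligence, we show that CLMs learn substructures characteristic of known dual-target compounds as a basis for generating new candidates with various chemical modifications.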
