Dual-target candidate compounds from a transformer chemical language model contain characteristic structural features

European Journal of Medicinal Chemistry Reports, Pub Date: 2025-07-30, DOI: 10.1016/j.ejmcr.2025.100291

Sanjana Srinivasan, Alec Lamens, Jürgen Bajorath

Volume 15, Article 100291. Full text: https://www.sciencedirect.com/science/article/pii/S2772417425000470

Citations: 0

Abstract



Chemical language models (CLMs) are increasingly used for the generative design of candidate compounds in medicinal chemistry. However, their predictions are difficult to rationalize, and detailed computational explanations of CLM-based compound generation are currently unavailable. We have therefore attempted to better understand, from a medicinal chemistry perspective, how CLMs learn and arrive at compound predictions. To this end, we subjected dual-target candidate compounds for polypharmacology, generated with transformer CLMs, to a series of analysis steps that explore the structural features the models learn, and compared these candidates to known compounds with dual-target activity. Using machine learning combined with distinct chemical structure-oriented approaches from explainable artificial intelligence, we show that CLMs learn substructures characteristic of known dual-target compounds and use them as a basis for generating new candidates with various chemical modifications.


Source journal

European Journal of Medicinal Chemistry Reports

CiteScore

4.50

Self-citation rate

0.00%

Number of publications