ChiralCat: Molecular chirality classification with enhanced spatial representation using learnable queries

Yichuan Peng, Gufeng Yu, Runhan Shi, Letian Chen, Xi Wang, Wenjie Du, Xiaohong Huo, Yang Yang
Journal: Artificial intelligence chemistry, Volume 3, Issue 2, Article 100091
Published: 2025-06-27
DOI: 10.1016/j.aichem.2025.100091
URL: https://www.sciencedirect.com/science/article/pii/S2949747725000089
Citations: 0

Abstract
Molecular chirality is a key focus of research in chemistry and biology. Chirality occurs in many complex categories in nature and can strongly alter biochemical activities and interactions, particularly in asymmetric catalysis and protein–drug binding. Despite advances in molecular property prediction, no computational method capable of identifying chiral types has been available, which has impeded progress in chirality studies. This gap is primarily due to the inability of current molecular representation models to capture chirality-related spatial features and to the scarcity of annotated datasets for complex chiral types. To address these limitations, we develop ChiralCat, a pioneering machine learning method for molecular chirality classification. ChiralCat's core is a pre-trained multi-modal classifier that enhances spatial molecular representations. This is achieved through learnable queries, guided by chirality-related descriptions generated by a large language model (LLM). To facilitate the model's training, we construct an extensive chiral molecule dataset comprising 17,181 molecules across various chiral categories. Our experimental results, both quantitative and visualized, show that ChiralCat outperforms existing 3D molecular representation learning models in capturing spatial information pertinent to chirality and is therefore better at discerning complex chiral types.
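The abstract does not describe how the learnable queries are wired into the model. As a generic illustration only, and not the authors' architecture, the sketch below shows the common pattern in which a small set of query vectors cross-attends over per-atom features to produce a fixed-size summary of a molecule. The function names, the single-head attention without projection matrices, and the random stand-in features are all assumptions made for this example. In a real model the queries would be trained parameters and the atom features would come from a 3D molecular encoder.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, atom_features):
    """Let each query attend over all atom features.

    Hypothetical helper: single head, no learned projections.
    queries:       K vectors of length d (the learnable parameters)
    atom_features: N vectors of length d (e.g. from a 3D encoder)
    returns:       K vectors of length d, one summary per query
    """
    d = len(queries[0])
    outputs = []
    for q in queries:
        # Scaled dot-product score between this query and every atom.
        scores = [
            sum(qi * ai for qi, ai in zip(q, a)) / math.sqrt(d)
            for a in atom_features
        ]
        weights = softmax(scores)
        # Attention-weighted average of the atom features.
        pooled = [
            sum(w * a[j] for w, a in zip(weights, atom_features))
            for j in range(d)
        ]
        outputs.append(pooled)
    return outputs


rng = random.Random(0)
d, num_queries, num_atoms = 4, 3, 5
# Random stand-ins (assumptions): initial queries and per-atom features.
queries = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(num_queries)]
atoms = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(num_atoms)]

summary = cross_attend(queries, atoms)
print(len(summary), len(summary[0]))  # 3 4
```

Whatever the number of atoms, the output always has one vector per query, which is what makes this pattern convenient for feeding a classifier head or aligning with text embeddings.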