Yichuan Peng, Gufeng Yu, Runhan Shi, Letian Chen, Xi Wang, Wenjie Du, Xiaohong Huo, Yang Yang
ChiralCat: Molecular chirality classification with enhanced spatial representation using learnable queries
Artificial Intelligence Chemistry, Volume 3, Issue 2, Article 100091. Published 2025-06-27.
DOI: 10.1016/j.aichem.2025.100091
https://www.sciencedirect.com/science/article/pii/S2949747725000089
Abstract
Molecular chirality is a key focus of research in chemistry and biology. Chirality occurs in nature in many complex categories and can strongly alter biochemical activities and interactions, particularly in asymmetric catalysis and protein–drug binding. Despite advances in molecular property prediction, no computational method has been available for identifying chiral types, which has impeded progress in chirality studies. This gap stems primarily from two limitations: current molecular representation models cannot capture chirality-related spatial features, and annotated datasets for complex chiral types are scarce. To address these limitations, we develop ChiralCat, a machine learning method for molecular chirality classification. At its core is a pre-trained multi-modal classifier that enhances spatial molecular representations through learnable queries, guided by chirality-related descriptions generated by a large language model (LLM). To support training, we construct a chiral molecule dataset comprising 17,181 molecules across various chiral categories. Our quantitative results and visualizations show that ChiralCat outperforms existing 3D molecular representation learning models in capturing spatial information pertinent to chirality, and is therefore better at discerning complex chiral types.
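
The abstract's point about chirality-related spatial features can be illustrated with a small, generic example. The sketch below is not ChiralCat's method and uses hypothetical coordinates and helper names. It only shows a standard geometric fact: a descriptor built from interatomic distances is identical for a structure and its mirror image, whereas a signed volume over the ordered substituents of a stereocentre changes sign.

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """Sorted inter-point distances: unchanged by rotation AND by reflection."""
    return sorted(math.dist(p, q) for p, q in combinations(points, 2))


def signed_volume(a, b, c, d):
    """Signed volume of the tetrahedron (a, b, c, d): (b-a) . ((c-a) x (d-a)) / 6.

    The sign depends on the handedness of the ordered points.
    """
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    w = [d[i] - a[i] for i in range(3)]
    cross = (
        v[1] * w[2] - v[2] * w[1],
        v[2] * w[0] - v[0] * w[2],
        v[0] * w[1] - v[1] * w[0],
    )
    return sum(u[i] * cross[i] for i in range(3)) / 6.0


def mirror(points):
    """Reflect every point through the yz-plane (x -> -x)."""
    return [(-x, y, z) for x, y, z in points]


# Hypothetical positions of four distinct substituents around one stereocentre,
# listed in a fixed (labelled) order. The values are illustrative only.
substituents = [
    (1.0, 1.0, 1.0),
    (-1.2, -1.2, 1.2),
    (-1.4, 1.4, -1.4),
    (1.6, -1.6, -1.6),
]
mirrored = mirror(substituents)

same_distances = all(
    math.isclose(x, y)
    for x, y in zip(pairwise_distances(substituents), pairwise_distances(mirrored))
)
print("distance descriptor identical:", same_distances)
print("signed volume (original):", signed_volume(*substituents))
print("signed volume (mirrored):", signed_volume(*mirrored))
```

A model that sees only the distance descriptor cannot tell the two mirror-image arrangements apart, which is one simple way to see why chirality calls for representations that are sensitive to handedness.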