Yan Sun, Yutong Lu, Yan Yi Li, Zihao Jing, Carson K Leung, Pingzhao Hu
Communications Chemistry 8(1), 286. Published 2025-09-29. DOI: https://doi.org/10.1038/s42004-025-01683-z. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12480901/pdf/
MolGraph-xLSTM as a graph-based dual-level xLSTM framework for enhanced molecular representation and interpretability.
Predicting molecular properties is essential for drug discovery, and computational methods can greatly accelerate this process. Molecular graphs have become a focus for representation learning, with Graph Neural Networks (GNNs) widely used. However, GNNs often struggle to capture long-range dependencies. To address this, we propose MolGraph-xLSTM, a novel graph-based xLSTM model that enhances feature extraction and models long-range interactions within molecules. Our approach processes molecular graphs at two scales: atom-level and motif-level. For atom-level graphs, a GNN-based xLSTM framework with jumping knowledge extracts local features and aggregates information across layers to capture both local and global patterns. Motif-level graphs provide complementary structural information for a broader view of the molecule. Embeddings from both scales are refined via a multi-head mixture of experts (MHMoE), further enhancing expressiveness and performance. We validate MolGraph-xLSTM on 21 datasets from the MoleculeNet and Therapeutics Data Commons (TDC) benchmarks, covering both classification and regression tasks. On the MoleculeNet benchmark, our model achieves an average AUROC improvement of 3.18% on classification tasks and an average RMSE reduction of 3.83% on regression tasks compared with baseline methods. On the TDC benchmark, MolGraph-xLSTM improves AUROC by 2.56% and reduces RMSE by 3.71% on average. These results confirm the effectiveness of our model in learning generalizable molecular representations for drug discovery.
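To make the "jumping knowledge" idea in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a plain mean-over-neighbors update in place of the paper's GNN and xLSTM blocks, concatenation as the layer-aggregation rule, and a mean readout. The function names and the toy three-atom chain are hypothetical, chosen only for illustration.

```python
from typing import Dict, List

Vector = List[float]


def mean_aggregate(features: List[Vector], neighbors: Dict[int, List[int]]) -> List[Vector]:
    """One message-passing round: each node averages itself with its neighbors.

    Assumption: this stands in for a real GNN layer, which would also apply
    learned weights and a nonlinearity.
    """
    updated = []
    for node, own in enumerate(features):
        group = [own] + [features[n] for n in neighbors[node]]
        dim = len(own)
        updated.append([sum(vec[d] for vec in group) / len(group) for d in range(dim)])
    return updated


def jumping_knowledge_concat(
    features: List[Vector], neighbors: Dict[int, List[int]], num_layers: int
) -> List[Vector]:
    """Run num_layers rounds and concatenate every layer's output for each node."""
    per_layer = []
    current = features
    for _ in range(num_layers):
        current = mean_aggregate(current, neighbors)
        per_layer.append(current)
    return [
        [value for layer in per_layer for value in layer[node]]
        for node in range(len(features))
    ]


def mean_readout(node_embeddings: List[Vector]) -> Vector:
    """Average node embeddings into one graph-level vector."""
    count = len(node_embeddings)
    dim = len(node_embeddings[0])
    return [sum(vec[d] for vec in node_embeddings) / count for d in range(dim)]


# Hypothetical toy graph: a 3-atom chain 0-1-2 with one scalar feature per atom.
toy_neighbors = {0: [1], 1: [0, 2], 2: [1]}
toy_features = [[1.0], [0.0], [0.0]]

node_embeddings = jumping_knowledge_concat(toy_features, toy_neighbors, num_layers=2)
graph_embedding = mean_readout(node_embeddings)
print(node_embeddings)
print(graph_embedding)
```

In this toy run, atom 2 receives nothing from atom 0 after one round and a nonzero signal after two, so the concatenated vector keeps both the short-range and the longer-range view of each atom. This is the sense in which layer aggregation can capture local and global patterns together.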
About the journal:
Communications Chemistry is an open access journal from Nature Research publishing high-quality research, reviews and commentary in all areas of the chemical sciences. Research papers published by the journal represent significant advances bringing new chemical insight to a specialized area of research. We also aim to provide a community forum for issues of importance to all chemists, regardless of sub-discipline.