GraphSLA: Graph machine learning for predicting small molecule - lncRNA associations

Artificial Intelligence Chemistry | Pub Date: 2025-08-11 | DOI: 10.1016/j.aichem.2025.100094

Ashish Panghalia, Parth Kumar, Vikram Singh

Volume 3, Issue 2, Article 100094 | https://www.sciencedirect.com/science/article/pii/S2949747725000119

Citations: 0

Abstract


Long non-coding RNAs (lncRNAs) are increasingly reported to play critical roles in gene expression, the regulation of cellular processes, and the onset and manifestation of various diseases. Recent studies have highlighted the role of small molecules (SMs) in controlling lncRNA function, making SM-lncRNA associations (SLAs) a promising avenue for therapeutic development. In this study, five graph-learning models for SLA classification are developed from 3563 curated SLAs among 115 SMs and 2826 lncRNAs. Node2Vec was used to extract contextual features of SMs and lncRNAs from their bipartite association network, while Mol2Vec and Doc2Vec were used to extract molecular features of the SMs and lncRNAs, respectively. The principal components accounting for 95% of the variability in the feature vectors were used to train the five models: Graph Neural Network (GNN), Graph Convolutional Network (GCN), Graph Attention Network (GAT), Graph Sample and Aggregate (GraphSAGE), and Simplified Graph Convolution (SGConv). Among them, GraphSAGE performed best, with an accuracy of 98.0% and an AUC-ROC of 99.4% when evaluated over 10 training epochs. Generalizability studies were also conducted to assess whether the models remain robust, reliable, and practically useful when applied to real-world data. Overall, the reported results outperform previously developed SLA prediction methods. This study underscores the potential of graph-learning methods to capture the intricate associations among SMs and lncRNAs and thereby facilitate the discovery of novel SLAs.
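As a rough illustration of the neighborhood aggregation that a GraphSAGE-style model performs on a bipartite SM-lncRNA network, the following is a minimal pure-Python sketch. It is not the authors' implementation. The node names, edges, 2-dimensional input features (standing in for PCA-reduced embeddings), the fixed untrained weight matrices, and the dot-product link scorer are all assumptions made for illustration.

```python
import math

# Hypothetical toy bipartite graph: edges connect a small molecule to an lncRNA.
EDGES = [("SM_a", "lnc_1"), ("SM_a", "lnc_2"), ("SM_b", "lnc_2"), ("SM_b", "lnc_3")]

# Hypothetical 2-d node features, standing in for PCA-reduced embeddings.
FEATURES = {
    "SM_a": [1.0, 0.0],
    "SM_b": [0.0, 1.0],
    "lnc_1": [1.0, 1.0],
    "lnc_2": [0.5, 0.5],
    "lnc_3": [0.0, 2.0],
}

# Fixed, untrained weights (assumption): identity for the node itself,
# half-weight for the aggregated neighborhood.
W_SELF = [[1.0, 0.0], [0.0, 1.0]]
W_NEIGH = [[0.5, 0.0], [0.0, 0.5]]


def build_neighbors(edges):
    """Return an undirected adjacency map {node: [neighbors]}."""
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, []).append(v)
        nbrs.setdefault(v, []).append(u)
    return nbrs


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sage_layer(features, nbrs, w_self, w_neigh):
    """One mean-aggregation layer:
    h_v = ReLU(W_self x_v + W_neigh * mean of x_u over neighbors u of v)."""
    hidden = {}
    for node, x in features.items():
        neigh_feats = [features[u] for u in nbrs.get(node, [])]
        agg = mean_vector(neigh_feats) if neigh_feats else [0.0] * len(x)
        z = [a + b for a, b in zip(matvec(w_self, x), matvec(w_neigh, agg))]
        hidden[node] = [max(0.0, value) for value in z]
    return hidden


def link_score(hidden, u, v):
    """Sigmoid of the dot product of two node representations."""
    dot = sum(a * b for a, b in zip(hidden[u], hidden[v]))
    return 1.0 / (1.0 + math.exp(-dot))


hidden = sage_layer(FEATURES, build_neighbors(EDGES), W_SELF, W_NEIGH)
for pair in [("SM_a", "lnc_1"), ("SM_a", "lnc_3")]:
    print(pair, round(link_score(hidden, *pair), 3))
```

Because the weights here are fixed rather than learned, the scores carry no predictive meaning. In a trained model the weight matrices would be fitted to labeled SM-lncRNA pairs, usually with a graph learning library rather than hand-written loops.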


Source journal: Artificial Intelligence Chemistry (Chemistry, General)

Self-citation rate: 0.00%

Review time: 21 days