Predicting disease associations based on the higher order structure of ceRNA networks.

Impact factor: 7.7; Zone 2 (Biology); JCR Q1, Biochemical Research Methods

Briefings in Bioinformatics; Published: 2025-08-31; DOI: 10.1093/bib/bbaf518

Zhaoliang Chai, Ying Su, Xuecong Tian, Chen Chen, Xiaoyi Lv, Cheng Chen

Volume 26, issue 5. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12495994/pdf/

Citations: 0

Abstract

Competitive endogenous RNA (ceRNA) network regulation is an important posttranscriptional regulatory mechanism in physiological and pathological processes, and it has been widely used in biomarker screening and in studies of regulatory factors for disease-related genes. However, existing studies have mainly focused on associations between a single type of RNA and disease, while work applying ceRNA networks to disease prediction remains limited, so exploring their potential for disease prediction is important. In this study, we propose CERDA-HOSR, a computational method for mining ceRNA network-disease associations based on higher-order graph attention networks. The method uses higher-order graph convolutional networks to aggregate neighborhood information and generate representations of the different RNAs and diseases. Given the higher-order complexity of biological networks and the sample imbalance problem, traditional random negative sampling struggles to capture global information; we therefore design a higher-order negative sampling strategy that combines network structure with higher-order neighborhood relations to improve the quality of negative samples and, in turn, the generalization ability and prediction accuracy of the model. Finally, LightGBM computes the ceRNA network-disease association probability from the learned embeddings. Extensive simulation experiments support the superiority of CERDA-HOSR, and case studies of cardiovascular disease, acute myeloid leukemia, and papillary thyroid cancer further demonstrate its practical application. In addition, ablation experiments and exploratory analyses further support its robustness, making it an effective tool for disease prediction and biomarker screening.
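The abstract does not say how negative pairs are chosen, so the following Python sketch only illustrates one plausible reading of "higher-order negative sampling": unobserved RNA-disease pairs are accepted as negatives only when the two nodes are several hops apart in the network, or disconnected. The hop-distance rule, the `min_hops` threshold, the function names `hop_distances` and `sample_negatives`, and the toy graph are all assumptions made for illustration. This is not the CERDA-HOSR implementation.

```python
import random
from collections import deque


def hop_distances(adj, source):
    """Breadth-first search: hop count from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj.get(node, ()):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def sample_negatives(adj, rnas, diseases, positives, min_hops=3, n_samples=None, seed=0):
    """Sample unobserved (rna, disease) pairs that are at least `min_hops` apart.

    Hypothetical criterion: pairs that are close in the network are more likely
    to be undiscovered true associations, so they are not used as negatives.
    """
    rng = random.Random(seed)
    candidates = []
    for rna in rnas:
        dist = hop_distances(adj, rna)
        for disease in diseases:
            if (rna, disease) in positives:
                continue  # known association, never a negative
            # Unreachable nodes count as infinitely far away.
            if dist.get(disease, float("inf")) >= min_hops:
                candidates.append((rna, disease))
    if n_samples is None:
        n_samples = len(positives)  # aim for a balanced training set
    rng.shuffle(candidates)
    return candidates[:n_samples]


# Toy network (illustrative only): r1-d1, r2-d2, r3-d3 are known associations,
# and r1-r2 stands in for a shared regulatory link between two RNAs.
edges = [("r1", "d1"), ("r2", "d2"), ("r1", "r2"), ("r3", "d3")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

rnas = ["r1", "r2", "r3"]
diseases = ["d1", "d2", "d3"]
positives = {("r1", "d1"), ("r2", "d2"), ("r3", "d3")}

negatives = sample_negatives(adj, rnas, diseases, positives, min_hops=3)
print(negatives)
```

In this toy network, the pairs (r1, d2) and (r2, d1) are only two hops apart, so the sketch never returns them as negatives, whereas plain random sampling could. In a pipeline like the one the abstract outlines, the sampled negatives and the known positives would then be paired with learned node embeddings and passed to a classifier such as LightGBM.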

Source journal: Briefings in Bioinformatics (Biology: Biochemical Research Methods)

CiteScore: 13.20

Self-citation rate: 13.70%

Articles published: 549

Review time: 6 months

About the journal: Briefings in Bioinformatics is an international journal serving as a platform for researchers and educators in the life sciences. It also appeals to mathematicians, statisticians, and computer scientists applying their expertise to biological challenges. The journal focuses on reviews tailored for users of databases and analytical tools in contemporary genetics, molecular and systems biology. It stands out by offering practical assistance and guidance to non-specialists in computerized methodologies. Covering a wide range from introductory concepts to specific protocols and analyses, the papers address bacterial, plant, fungal, animal, and human data. The journal's detailed subject areas include genetic studies of phenotypes and genotypes, mapping, DNA sequencing, expression profiling, gene expression studies, microarrays, alignment methods, protein profiles and HMMs, lipids, metabolic and signaling pathways, structure determination and function prediction, phylogenetic studies, and education and training.