BioKG-CMI: a multi-source feature fusion model based on biological knowledge graph for predicting circRNA-miRNA interactions
Mengmeng Wei, Lei Wang, Yang Li, Zhengwei Li, Bowei Zhao, Xiaorui Su, Yu Wei, Zhuhong You
Science China Information Sciences, published 2024-07-22
DOI: https://doi.org/10.1007/s11432-024-4098-3
Citation count: 0
Abstract
This study proposes a model named BioKG-CMI that predicts circRNA-miRNA interactions (CMIs) from a biological knowledge graph. Because the available data are limited, we use subcellular localization to generate negative samples that align more closely with biological logic. To mine semantic information from circRNA and miRNA sequences, we use the pre-trained model BERT to learn sequence feature representations. Guided by the hypothesis that adjacent molecules have similar functions, we calculate the spatial proximity between nodes of the same class. The DistMult algorithm is applied to extract the latent logical rules of the knowledge graph and to learn entity and relation representations. Integrating these multiple features addresses the difficulty of representing a complex biological knowledge graph and overcomes the inadequacy of any single feature. Multiple comparative experiments and case studies demonstrate the robustness of the proposed model.
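The abstract does not spell out how the knowledge graph embedding step scores a triple. As a minimal sketch, assuming the abstract refers to the standard DistMult formulation, the score of a (head, relation, tail) triple is the sum of the element-wise product of the three embedding vectors. The function name, the relation name, and the embedding values below are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from typing import Sequence


def distmult_score(
    head: Sequence[float],
    relation: Sequence[float],
    tail: Sequence[float],
) -> float:
    """Standard DistMult triple score: sum_i head[i] * relation[i] * tail[i]."""
    if not (len(head) == len(relation) == len(tail)):
        raise ValueError("all embeddings must have the same dimension")
    return sum(h * r * t for h, r, t in zip(head, relation, tail))


# Hypothetical 3-dimensional embeddings (illustrative values, not from the paper).
circrna = [1.0, 2.0, 0.5]
interacts_with = [0.5, 1.0, 2.0]
mirna = [2.0, 0.5, 1.0]

score = distmult_score(circrna, interacts_with, mirna)
print(score)
```

Because the product is element-wise, this score is unchanged when head and tail are swapped, so the formulation treats a relation as symmetric. That is plausibly acceptable for an undirected interaction relation, though whether the authors rely on this property is not stated in the abstract.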
About the journal:
Science China Information Sciences is a dedicated journal that showcases high-quality, original research across various domains of information sciences. It encompasses Computer Science & Technologies, Control Science & Engineering, Information & Communication Engineering, Microelectronics & Solid-State Electronics, and Quantum Information, providing a platform for the dissemination of significant contributions in these fields.