A survey on multi-view fusion for predicting links in biomedical bipartite networks: Methods and applications

IF 14.7 | Zone 1, Computer Science | Q1, Computer Science, Artificial Intelligence

Information Fusion | Pub Date: 2024-12-24 | DOI: 10.1016/j.inffus.2024.102894

Yuqing Qian, Yizheng Wang, Junkai Liu, Quan Zou, Yijie Ding, Xiaoyi Guo, Weiping Ding


Citations: 0

Abstract


Biomedical research increasingly relies on the analysis of complex interactions between biological entities, such as genes, proteins, and drugs. Although advancements in biomedical technologies have led to a vast accumulation of relational data, the high cost and time demands of wet-lab experiments have limited the number of verified interactions. Thus, computational methods have become essential for predicting potential links by leveraging diverse datasets to efficiently and accurately identify promising interactions. Multi-view fusion, which combines complementary information from multiple sources, has shown significant promise for enhancing the prediction accuracy and robustness. We introduce the framework of multi-view fusion methods by elaborating on key components. This includes a comprehensive examination of multi-view data sources covering various omics and biological databases. We then describe the feature extraction techniques and explore how meaningful features can be derived from heterogeneous data formats. Next, we offer an in-depth review of the fusion strategies and categorize them as early fusion, late fusion, and fusion during the training phase. We discuss the advantages and limitations of each approach, emphasizing the need for sophisticated techniques that consider the unique attributes of biological link prediction. We also provide an overview of the commonly used datasets, evaluation metrics, and validation techniques. Commonly used datasets serve as reliable benchmarks for evaluating the computational models. Evaluation metrics and validation techniques are crucial for reliably assessing the performances of link prediction models. Subsequently, a comparative analysis of different fusion methods is conducted to empirically evaluate their performances on widely available biomedical datasets. This yielded valuable insights into the strengths and limitations of each approach in real-world applications. 
Finally, we identify key obstacles such as data heterogeneity, model robustness, and missing data and suggest potential directions for future research. Our findings offer valuable insights into the applications and future directions of multi-view fusion methods for biomedical link prediction, highlighting their potential to accelerate discovery and innovation in the field.
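The abstract names early fusion and late fusion as two of the three fusion categories. As a rough illustration only, and not the survey's own method, the difference can be sketched on a single candidate link (for example, one drug-target pair) described by two views. The function names `early_fusion`, `late_fusion`, and `linear_score`, the toy feature values, and the linear scorer are all assumptions made for this sketch.

```python
from typing import List, Optional, Sequence

Vector = List[float]


def early_fusion(views: Sequence[Vector]) -> Vector:
    """Early fusion (hypothetical helper): concatenate the per-view
    feature vectors of one candidate pair before any model sees them."""
    fused: Vector = []
    for view in views:
        fused.extend(view)
    return fused


def linear_score(features: Vector, coef: Vector) -> float:
    """Toy stand-in for a trained predictor: a plain dot product."""
    return sum(f * c for f, c in zip(features, coef))


def late_fusion(scores: Sequence[float],
                weights: Optional[Sequence[float]] = None) -> float:
    """Late fusion (hypothetical helper): combine the scores produced by
    separate per-view models with a weighted average."""
    if weights is None:
        weights = [1.0] * len(scores)
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


# Two illustrative views of the same candidate pair (made-up numbers).
chemical_view = [0.25, 0.75]
sequence_view = [0.5]

# Early fusion: one model scores the concatenated vector.
fused_features = early_fusion([chemical_view, sequence_view])
early_prediction = linear_score(fused_features, [1.0, 1.0, 1.0])

# Late fusion: each view is scored on its own, then the scores are merged.
per_view_scores = [
    linear_score(chemical_view, [1.0, 1.0]),
    linear_score(sequence_view, [1.0]),
]
late_prediction = late_fusion(per_view_scores, weights=[3.0, 1.0])
```

Fusion during the training phase, the third category in the abstract, couples the views inside the learning objective itself and is not captured by this sketch.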


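Source journal

Information Fusion (Engineering and Technology; Computer Science: Theory and Methods)

CiteScore: 33.20

Self-citation rate: 4.30%

Articles published: 161

Review time: 7.9 months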

About the journal: Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses, as well as those demonstrating their application to real-world problems, are welcome.