Machine learning solutions for predicting protein–protein interactions

IF 16.8 · Zone 2, Chemistry · Q1 Chemistry, Multidisciplinary

Wiley Interdisciplinary Reviews: Computational Molecular Science, vol. 12, no. 6 · Pub Date: 2022-03-29 · DOI: 10.1002/wcms.1618

Rita Casadio, Pier Luigi Martelli, Castrense Savojardo


Citations: 19

Abstract

Proteins are “social molecules.” Recent experimental evidence supports the notion that large protein aggregates, known as biomolecular condensates, affect many biological processes both structurally and functionally. Condensate formation may be permanent and/or time dependent, suggesting that biological processes can occur locally, depending on the cell's needs. The question then arises as to what extent we can monitor protein-aggregate formation, both experimentally and theoretically, and then predict or simulate functional aggregate formation. Available data cover mesoscopic interaction networks at the proteome level, protein-binding affinities, and interacting protein complexes solved at atomic resolution. Powerful algorithms based on machine learning (ML) can extract information from data sets and infer properties of never-seen-before examples. ML tools address the problem of protein–protein interactions (PPIs) using different data sets, input features, and architectures. According to recent publications, deep learning is the most successful method. However, in ML-based computational biology, convincing evidence of success comes from general benchmarks on blind data sets. Results indicate that state-of-the-art ML approaches, whether based on traditional or deep learning, can still be improved, irrespective of the power of the method and the richness of the input features. This being the case, it is evident that even powerful methods are not yet trained on the whole possible spectrum of PPIs, and that more investigation is necessary to complete our knowledge of functional protein–protein interactions.
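The abstract's point about blind data sets can be illustrated with a small sketch. The review itself does not prescribe a splitting procedure; the following is one common way to build a blind test set, assuming interaction data arrive as `(protein_a, protein_b, label)` tuples. The function name `blind_split`, its parameters, and the protein identifiers `P0`–`P9` are hypothetical and used only for illustration.

```python
import random


def blind_split(pairs, test_fraction=0.2, seed=0):
    """Split labeled protein pairs so that no test protein appears in training.

    Assumption (not from the review): a pair is a tuple
    (protein_a, protein_b, label).
    """
    # Collect every protein that occurs in any pair, in a stable order.
    proteins = sorted({p for a, b, _ in pairs for p in (a, b)})
    rng = random.Random(seed)
    rng.shuffle(proteins)

    # Reserve a fraction of the proteins (not of the pairs) for testing.
    n_test = max(1, int(len(proteins) * test_fraction))
    test_proteins = set(proteins[:n_test])

    train, test, dropped = [], [], []
    for a, b, label in pairs:
        a_held_out = a in test_proteins
        b_held_out = b in test_proteins
        if a_held_out and b_held_out:
            test.append((a, b, label))      # both partners unseen in training
        elif not a_held_out and not b_held_out:
            train.append((a, b, label))     # both partners available for training
        else:
            dropped.append((a, b, label))   # mixed pairs would leak information
    return train, test, dropped


# Toy data: all 45 unordered pairs over ten hypothetical proteins,
# with arbitrary placeholder labels.
toy_pairs = [
    (f"P{i}", f"P{j}", (i + j) % 2)
    for i in range(10)
    for j in range(i + 1, 10)
]
train_pairs, test_pairs, dropped_pairs = blind_split(toy_pairs, test_fraction=0.2)
```

Splitting by protein rather than by pair is the design choice that matters here: a random split over pairs lets the same protein appear on both sides, which tends to inflate benchmark scores. The cost is that mixed pairs are discarded, so the usable data set shrinks.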

This article is categorized under:


Source journal

Wiley Interdisciplinary Reviews: Computational Molecular Science (Chemistry, Multidisciplinary; Mathematical & Computational Biology)

CiteScore: 28.90

Self-citation rate: 1.80%

Articles published:

Review time: 6-12 weeks

About the journal: Computational molecular sciences harness the power of rigorous chemical and physical theories, employing computer-based modeling, specialized hardware, software development, algorithm design, and database management to explore and illuminate every facet of molecular sciences. These interdisciplinary approaches form a bridge between chemistry, biology, and materials sciences, establishing connections with adjacent application-driven fields in both chemistry and biology. WIREs Computational Molecular Science stands as a platform to comprehensively review and spotlight research from these dynamic and interconnected fields.