Holographic-(V)AE: An end-to-end SO(3)-equivariant (variational) autoencoder in Fourier space.

Impact factor: 4.2

Physical Review Research. Pub Date: 2024-04-01. DOI: 10.1103/physrevresearch.6.023006

Gian Marco Visani, Michael N Pun, Arman Angaji, Armita Nourmohammad



Abstract

Group-equivariant neural networks have emerged as an efficient approach to model complex data, using generalized convolutions that respect the relevant symmetries of a system. These techniques have made advances in both the supervised learning tasks for classification and regression, and the unsupervised tasks to generate new data. However, little work has been done in leveraging the symmetry-aware expressive representations that could be extracted from these approaches. Here, we present holographic-(variational) autoencoder [H-(V)AE], a fully end-to-end SO(3)-equivariant (variational) autoencoder in Fourier space, suitable for unsupervised learning and generation of data distributed around a specified origin in 3D. H-(V)AE is trained to reconstruct the spherical Fourier encoding of data, learning in the process a low-dimensional representation of the data (i.e., a latent space) with a maximally informative rotationally invariant embedding alongside an equivariant frame describing the orientation of the data. We extensively test the performance of H-(V)AE on diverse datasets. We show that the learned latent space efficiently encodes the categorical features of spherical images. Moreover, the low-dimensional representations learned by H-VAE can be used for downstream data-scarce tasks. Specifically, we show that H-(V)AE's latent space can be used to extract compact embeddings for protein structure microenvironments, and when paired with a random forest regressor, it enables state-of-the-art predictions of protein-ligand binding affinity.
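To make the idea of a rotationally invariant summary of a spherical Fourier encoding concrete, here is a minimal sketch in plain Python. It is not the H-(V)AE architecture: it projects a small, made-up point cloud onto real spherical harmonics up to degree 2 (the radial part of the encoding is omitted as a simplifying assumption) and computes the per-degree power spectrum, a classical hand-crafted invariant. The function names `real_sph_harm`, `project`, `power_spectrum`, and `rotate` are hypothetical and chosen only for this illustration.

```python
import math

def real_sph_harm(x, y, z):
    """Real spherical harmonics of degree l = 0, 1, 2 in the direction (x, y, z)."""
    r = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / r, y / r, z / r
    c0 = 0.5 / math.sqrt(math.pi)
    c1 = math.sqrt(3.0 / (4.0 * math.pi))
    c2 = 0.5 * math.sqrt(15.0 / math.pi)
    c20 = 0.25 * math.sqrt(5.0 / math.pi)
    return {
        0: [c0],
        1: [c1 * y, c1 * z, c1 * x],
        2: [c2 * x * y, c2 * y * z, c20 * (3.0 * z * z - 1.0),
            c2 * x * z, 0.5 * c2 * (x * x - y * y)],
    }

def project(points):
    """Sum the harmonics over all points (angular part only, unit weights)."""
    coeffs = {0: [0.0], 1: [0.0] * 3, 2: [0.0] * 5}
    for p in points:
        for l, vals in real_sph_harm(*p).items():
            coeffs[l] = [c + v for c, v in zip(coeffs[l], vals)]
    return coeffs

def power_spectrum(coeffs):
    """Squared norm of the coefficients within each degree l."""
    return [sum(c * c for c in coeffs[l]) for l in sorted(coeffs)]

def rotate(points, alpha, beta):
    """Rotate each point about the z axis by alpha, then about the x axis by beta."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    out = []
    for x, y, z in points:
        x, y = ca * x - sa * y, sa * x + ca * y
        y, z = cb * y - sb * z, sb * y + cb * z
        out.append((x, y, z))
    return out

# Illustrative data only; these points are not from the paper.
points = [(1.0, 0.2, -0.3), (-0.4, 0.9, 0.5), (0.1, -0.7, 0.8)]
rotated = rotate(points, 0.7, 1.3)

coeffs, coeffs_rot = project(points), project(rotated)
ps, ps_rot = power_spectrum(coeffs), power_spectrum(coeffs_rot)

print("power spectrum (original):", ps)
print("power spectrum (rotated): ", ps_rot)
```

The individual coefficients change when the points are rotated (they transform equivariantly), while the per-degree power stays the same up to floating-point error. According to the abstract, H-(V)AE instead learns its invariant embedding and equivariant frame end to end; the power spectrum here only illustrates the invariant/equivariant distinction.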

