Symmetry-Based Structured Matrices for Efficient Approximately Equivariant Networks
Ashwin Samudre, Mircea Petrache, Brian D. Nord, Shubhendu Trivedi
arXiv - STAT - Machine Learning, 2024-09-18. DOI: arxiv-2409.11772 (https://doi.org/arxiv-2409.11772)
Citation count: 0
Abstract
There has been much recent interest in designing symmetry-aware neural
networks (NNs) exhibiting relaxed equivariance. Such NNs aim to interpolate
between being exactly equivariant and being fully flexible, affording
consistent performance benefits. In a separate line of work, certain structured
parameter matrices -- those with displacement structure, characterized by low
displacement rank (LDR) -- have been used to design small-footprint NNs.
Displacement structure enables fast function and gradient evaluation, but it
permits accurate compressed approximations primarily of classical
convolutional neural networks (CNNs). In this work, we propose a general
framework -- based on a novel construction of symmetry-based structured
matrices -- to build approximately equivariant NNs with significantly reduced
parameter counts. Our framework integrates the two aforementioned lines of work
via the use of so-called Group Matrices (GMs), a forgotten precursor to the
modern notion of regular representations of finite groups. GMs allow the design
of structured matrices -- resembling LDR matrices -- which generalize the
linear operations of a classical CNN from cyclic groups to general finite
groups and their homogeneous spaces. We show that GMs can be employed to extend
all the elementary operations of CNNs to general discrete groups. Further, the
theory of structured matrices based on GMs generalizes LDR theory, which
focuses on matrices with cyclic structure, and thereby provides a tool for
implementing approximate equivariance for discrete groups. We test GM-based
architectures on a variety of tasks in the presence of relaxed symmetry. We
report that our framework performs consistently competitively with
approximately equivariant NNs and other structured matrix-based compression
frameworks, sometimes with a parameter count one or two orders of magnitude
lower.
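
To make the central object concrete, the following is a minimal sketch of a group matrix, not the authors' implementation. It assumes one common convention, M[g][h] = f(g^{-1} h), for a function f on a finite group G given by its Cayley table. All function names (cyclic_table, dihedral_table, group_matrix, equivariance_error) are hypothetical and introduced here only for illustration. For a cyclic group the construction reduces to a circulant matrix, the linear map behind a classical 1-D circular convolution; for a non-abelian group such as the dihedral group D_3 it gives a weight-shared matrix that commutes with left translations. The final perturbation is only an illustrative way to break exact equivariance and is not claimed to be the relaxation scheme used in the paper.

```python
def cyclic_table(n):
    """Cayley table of Z_n: table[g][h] = (g + h) mod n. Identity is index 0."""
    return [[(g + h) % n for h in range(n)] for g in range(n)]


def dihedral_table(n):
    """Cayley table of the dihedral group D_n (order 2n).

    Index r + n*s stands for rot^r * flip^s, so the identity is index 0.
    """
    def mul(a, b):
        r1, s1 = a % n, a // n
        r2, s2 = b % n, b // n
        # flip * rot^r = rot^(-r) * flip
        r = (r1 + (r2 if s1 == 0 else -r2)) % n
        return r + n * (s1 ^ s2)
    return [[mul(a, b) for b in range(2 * n)] for a in range(2 * n)]


def inverses(table):
    """inv[g] is the element h with g * h = identity (index 0)."""
    return [row.index(0) for row in table]


def group_matrix(table, f):
    """Group matrix under the assumed convention M[g][h] = f[g^{-1} h]."""
    inv = inverses(table)
    n = len(table)
    return [[f[table[inv[g]][h]] for h in range(n)] for g in range(n)]


def matvec(M, x):
    """Plain matrix-vector product on nested lists."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


def left_translate(table, u, x):
    """Left translation of a signal on G: (L_u x)[h] = x[u^{-1} h]."""
    inv = inverses(table)
    return [x[table[inv[u]][h]] for h in range(len(x))]


def equivariance_error(table, M, x):
    """Largest entry of |M L_u x - L_u M x| over all group elements u."""
    worst = 0
    for u in range(len(table)):
        lhs = matvec(M, left_translate(table, u, x))
        rhs = left_translate(table, u, matvec(M, x))
        worst = max(worst, max(abs(a - b) for a, b in zip(lhs, rhs)))
    return worst


# Cyclic case: the group matrix of Z_4 is a circulant matrix.
Z4 = cyclic_table(4)
C = group_matrix(Z4, [1, 2, 3, 4])

# Non-abelian case: D_3 has 6 elements, so 6 free weights fill a 6 x 6 matrix.
D3 = dihedral_table(3)
weights = [1, 2, 3, 4, 5, 6]
x = [1, 2, 3, 4, 5, 6]
M = group_matrix(D3, weights)
exact_error = equivariance_error(D3, M, x)

# Illustrative perturbation: changing a single entry breaks the weight sharing.
M_perturbed = [row[:] for row in M]
M_perturbed[0][1] += 1
perturbed_error = equivariance_error(D3, M_perturbed, x)

print(C)
print(exact_error, perturbed_error)
```

In this sketch the D_3 group matrix has 36 entries but only 6 free parameters, which is the kind of weight sharing that lets a structured layer stay small. Measuring how far a perturbed matrix is from commuting with translations is one simple way to quantify a departure from exact equivariance.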