Relative Representations: Topological and Geometric Perspectives
Alejandro García-Castellanos, Giovanni Luca Marchetti, Danica Kragic, Martina Scolamiero
arXiv:2409.10967 · arXiv - CS - Machine Learning · 2024-09-17
Relative representations are an established approach to zero-shot model
stitching, consisting of a non-trainable transformation of the latent space of
a deep neural network. Drawing on topological and geometric insights, we
propose two improvements to relative representations. First, we introduce a
normalization procedure in the relative transformation, yielding invariance
to non-isotropic rescalings and permutations; the latter coincide with the
symmetries in parameter space induced by common activation functions. Second,
we propose applying topological densification, a topological regularization
loss that encourages clustering within classes, when fine-tuning relative
representations. We provide an empirical investigation on a natural language
task, where both proposed variations improve performance on zero-shot model
stitching.
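To make the core construction concrete, the following is a minimal NumPy sketch of a relative transformation: each latent vector is re-expressed by its cosine similarity to a fixed set of anchor latents, which is the standard (non-trainable) relative-representation map. The per-coordinate standardization shown here is an illustrative guess at the kind of normalization the abstract describes; it does produce invariance to non-isotropic rescalings of the latent axes, and cosine similarity is invariant to a shared permutation of coordinates. Function and variable names are our own, not from the paper.

```python
import numpy as np

def relative_representation(z, anchors, normalize=True):
    """Map latent vectors `z` (n, d) to relative coordinates (n, k) via
    cosine similarity with `anchors` (k, d). The standardization step is
    an illustrative normalization, not necessarily the paper's exact one."""
    if normalize:
        # Standardize each latent dimension using anchor statistics; this
        # makes the map invariant to non-isotropic axis rescalings.
        mu = anchors.mean(axis=0)
        sigma = anchors.std(axis=0) + 1e-8  # guard against zero variance
        z = (z - mu) / sigma
        anchors = (anchors - mu) / sigma
    # Cosine similarity of each sample with each anchor; a shared
    # permutation of the d coordinates leaves these values unchanged.
    z_n = z / np.linalg.norm(z, axis=1, keepdims=True)
    a_n = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    return z_n @ a_n.T
```

Under this sketch, applying the same diagonal rescaling (or the same coordinate permutation) to both the latents and the anchors leaves the relative representation unchanged, which is the invariance property the abstract attributes to the normalized transformation.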