On the Anatomy of Attention
Nikhil Khatri, Tuomas Laakkonen, Jonathon Liu, Vincent Wang-Maścianica
arXiv:2407.02423 [math.CT], 2 July 2024
Abstract
We introduce a category-theoretic diagrammatic formalism in order to systematically relate and reason about machine learning models. Our diagrams present architectures intuitively but without loss of essential detail, where natural relationships between models are captured by graphical transformations, and important differences and similarities can be identified at a glance. In this paper, we focus on attention mechanisms: translating folklore into mathematical derivations, and constructing a taxonomy of attention variants in the literature. As a first example of an empirical investigation underpinned by our formalism, we identify recurring anatomical components of attention, which we exhaustively recombine to explore a space of variations on the attention mechanism.
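
For readers unfamiliar with the baseline the paper dissects, the sketch below shows standard scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V, as introduced by Vaswani et al. (2017). This is not the paper's diagrammatic formalism or taxonomy; it is only the conventional mechanism whose recurring anatomical components the paper identifies and recombines. All names and shapes here are illustrative assumptions.

```python
# Minimal sketch of standard scaled dot-product attention (Vaswani et al., 2017).
# Given only as the conventional baseline referenced above; the paper's own
# contribution is a diagrammatic formalism for relating variants of this mechanism.
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def scaled_dot_product_attention(Q, K, V):
    # Q: (n_queries, d_k), K: (n_keys, d_k), V: (n_keys, d_v)
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # similarity of each query to each key
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V                   # convex combination of value vectors


# Usage example with random inputs (shapes are hypothetical).
rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(6, 8))
V = rng.normal(size=(6, 16))
out = scaled_dot_product_attention(Q, K, V)   # shape (4, 16)
```

Variants surveyed in taxonomies of attention typically alter one of these pieces in isolation, for example the score function, the normalisation, or how queries, keys, and values are produced, which is what makes a component-wise "anatomy" of the mechanism natural to study.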