{"title":"通过学习动力了解合成映射的简单性偏差","authors":"Yi Ren, Danica J. Sutherland","doi":"arxiv-2409.09626","DOIUrl":null,"url":null,"abstract":"Obtaining compositional mappings is important for the model to generalize\nwell compositionally. To better understand when and how to encourage the model\nto learn such mappings, we study their uniqueness through different\nperspectives. Specifically, we first show that the compositional mappings are\nthe simplest bijections through the lens of coding length (i.e., an upper bound\nof their Kolmogorov complexity). This property explains why models having such\nmappings can generalize well. We further show that the simplicity bias is\nusually an intrinsic property of neural network training via gradient descent.\nThat partially explains why some models spontaneously generalize well when they\nare trained appropriately.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"30 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Understanding Simplicity Bias towards Compositional Mappings via Learning Dynamics\",\"authors\":\"Yi Ren, Danica J. Sutherland\",\"doi\":\"arxiv-2409.09626\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Obtaining compositional mappings is important for the model to generalize\\nwell compositionally. To better understand when and how to encourage the model\\nto learn such mappings, we study their uniqueness through different\\nperspectives. Specifically, we first show that the compositional mappings are\\nthe simplest bijections through the lens of coding length (i.e., an upper bound\\nof their Kolmogorov complexity). This property explains why models having such\\nmappings can generalize well. We further show that the simplicity bias is\\nusually an intrinsic property of neural network training via gradient descent.\\nThat partially explains why some models spontaneously generalize well when they\\nare trained appropriately.\",\"PeriodicalId\":501340,\"journal\":{\"name\":\"arXiv - STAT - Machine Learning\",\"volume\":\"30 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-09-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - STAT - Machine Learning\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2409.09626\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - STAT - Machine Learning","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.09626","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Understanding Simplicity Bias towards Compositional Mappings via Learning Dynamics
Obtaining compositional mappings is important for a model to generalize well compositionally. To better understand when and how to encourage a model to learn such mappings, we study their uniqueness from several perspectives. Specifically, we first show that compositional mappings are the simplest bijections through the lens of coding length (i.e., an upper bound on their Kolmogorov complexity). This property explains why models with such mappings can generalize well. We further show that this simplicity bias is usually an intrinsic property of neural-network training via gradient descent, which partially explains why some models spontaneously generalize well when trained appropriately.
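
To make the coding-length intuition concrete, here is a minimal toy sketch (not the paper's actual construction or bound): it assumes hypothetical two-attribute inputs and two-slot token vocabularies, and compares how many dictionary entries are needed to write down a compositional bijection versus an arbitrary one. The factorized (compositional) mapping needs only O(|shapes| + |colors|) entries, while a generic bijection over the same sets needs |shapes| * |colors| entries, which is the sense in which compositional mappings are the "simplest" bijections.

```python
# Toy illustration only: coding length of a compositional vs. arbitrary bijection.
# Names (shapes, colors, tokens_a, tokens_b) are hypothetical, not from the paper.
import itertools
import random

shapes = ["circle", "square", "triangle", "star"]
colors = ["red", "green", "blue", "yellow"]
tokens_a = ["a0", "a1", "a2", "a3"]   # token vocabulary for slot 1
tokens_b = ["b0", "b1", "b2", "b3"]   # token vocabulary for slot 2

inputs = list(itertools.product(shapes, colors))        # 16 attribute tuples
messages = list(itertools.product(tokens_a, tokens_b))  # 16 two-token messages

# Compositional mapping: each attribute is encoded independently by its own
# small dictionary, so the full lookup table factorizes.
shape_code = dict(zip(shapes, tokens_a))
color_code = dict(zip(colors, tokens_b))
compositional = {(s, c): (shape_code[s], color_code[c]) for s, c in inputs}

# Arbitrary bijection: a random pairing of inputs and messages.
random.seed(0)
arbitrary = dict(zip(inputs, random.sample(messages, len(messages))))

# Crude "coding length": number of dictionary entries needed to describe the map.
len_compositional = len(shape_code) + len(color_code)  # 4 + 4 = 8 entries
len_arbitrary = len(arbitrary)                         # 4 * 4 = 16 entries

print(f"compositional description: {len_compositional} entries, "
      f"arbitrary bijection: {len_arbitrary} entries")
```

Under this toy measure the compositional description grows additively in the attribute vocabularies while a generic bijection grows multiplicatively, mirroring the abstract's claim that compositional mappings admit the shortest descriptions among bijections.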