LeanKAN: a parameter-lean Kolmogorov-Arnold network layer with improved memory efficiency and convergence behavior

IF 6.3 · Zone 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Neural Networks Pub Date : 2025-07-19 DOI:10.1016/j.neunet.2025.107883

Benjamin C. Koenig, Suyong Kim, Sili Deng

Neural Networks, Volume 192, Article 107883. Not open access. https://www.sciencedirect.com/science/article/pii/S0893608025007646

Citations: 0

Abstract

The recently proposed Kolmogorov-Arnold network (KAN) is a promising alternative to multi-layer perceptrons (MLPs) for data-driven modeling. While original KAN layers were only capable of representing the addition operator, the recently proposed MultKAN layer combines addition and multiplication subnodes in an effort to improve representation performance. Here, we find that MultKAN layers suffer from three key drawbacks: limited applicability in output layers, bulky parameterizations with extraneous activations, and the inclusion of complex hyperparameters. To address these issues, we propose LeanKAN, a direct and modular replacement for MultKAN and traditional AddKAN layers. LeanKAN addresses these three drawbacks through general applicability as an output layer, significantly reduced parameter counts for a given network structure, and a smaller set of hyperparameters. As a one-to-one layer replacement for standard AddKAN and MultKAN layers, LeanKAN provides these benefits to traditional KAN learning problems as well as to augmented KAN structures in which it serves as the backbone, such as KAN Ordinary Differential Equations (KAN-ODEs) or Deep Operator KANs (DeepOKAN). We demonstrate LeanKAN’s simplicity and efficiency on a standard KAN toy problem as well as on ordinary and partial differential equations learned via KAN-ODEs, where we find that its sparser parameterization and compact structure increase its expressivity and learning capability, leading it to outperform similar and even much larger MultKANs in various tasks.
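To make the "bulky parameterizations with extraneous activations" point concrete, the following is a rough sketch of how spline coefficients might be counted for an AddKAN layer and a MultKAN layer of the same output width. It is not the paper's own accounting. It assumes each learnable activation is a B-spline with `grid_size + spline_order` coefficients, that a multiplication node of arity `arity` consumes `arity` subnodes, and it ignores any scale or bias parameters. The function names and default values are hypothetical, and LeanKAN's count is not computed because the abstract does not specify it.

```python
def spline_coeffs_per_edge(grid_size: int, spline_order: int) -> int:
    """Coefficients of one B-spline activation (assumed: grid_size + spline_order)."""
    return grid_size + spline_order


def addkan_layer_coeffs(n_in: int, n_out: int,
                        grid_size: int = 5, spline_order: int = 3) -> int:
    """Addition-only layer: one activation per (input, output) pair."""
    return n_in * n_out * spline_coeffs_per_edge(grid_size, spline_order)


def multkan_layer_coeffs(n_in: int, n_add: int, n_mult: int, arity: int = 2,
                         grid_size: int = 5, spline_order: int = 3) -> int:
    """Layer with n_add addition nodes and n_mult multiplication nodes.

    Assumption: every multiplication node multiplies `arity` subnodes, so the
    activations must feed n_add + arity * n_mult subnodes rather than
    n_add + n_mult outputs.
    """
    n_subnodes = n_add + arity * n_mult
    return n_in * n_subnodes * spline_coeffs_per_edge(grid_size, spline_order)


if __name__ == "__main__":
    # Both layers map 4 inputs to 4 outputs.
    print(addkan_layer_coeffs(4, 4))
    print(multkan_layer_coeffs(4, n_add=2, n_mult=2))
```

Under these assumptions, the MultKAN layer needs more activations than an AddKAN layer with the same number of inputs and outputs, because each multiplication node requires several subnodes to feed it.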


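Source journal

Neural Networks (Engineering and Technology; Computer Science: Artificial Intelligence)

CiteScore: 13.90

Self-citation rate: 7.70%

Articles published: 425

Review time: 67 days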

About the journal: Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.