Randomized Gauss–Seidel iterative algorithms for Extreme Learning Machines

Chinnamuthu Subramani, Ravi Prasad K. Jagannath, Venkatanareshbabu Kuppili

Physica A: Statistical Mechanics and its Applications, Volume 666, Article 130515 (published 2025-03-13). DOI: 10.1016/j.physa.2025.130515
Abstract
Extreme Learning Machines (ELMs) are a class of single hidden-layer feedforward neural networks known for their rapid training process, structural simplicity, and strong generalization capabilities. ELM training requires solving a system of linear equations, where solution accuracy directly impacts model performance. However, conventional ELMs rely on the Moore–Penrose inverse, which is computationally expensive, memory-intensive, and numerically unstable in ill-conditioned problems. Additionally, stabilizing matrix inversion requires a hyperparameter, whose optimal selection further increases computational complexity. Iterative numerical techniques offer a promising alternative; however, the stochastic nature of the feature matrix challenges deterministic methods, while stochastic gradient approaches are hyperparameter-sensitive and prone to local minima. To address these limitations, this study introduces randomized iterative algorithms that solve the original linear system without requiring matrix inversion or full-system computation, instead leveraging random subsets of data in a hyperparameter-free framework. Although these methods incorporate randomness, they are not arbitrary but remain system-dependent, dynamically adapting to the structure of the feature matrix. Theoretical analysis establishes upper bounds on the expected number of iterations, expressed in terms of statistical properties of the feature matrix, providing insights into near-singularity, condition number, and network size. Empirical evaluations on classification datasets demonstrate that the proposed methods consistently outperform conventional ELM, deterministic solvers, and gradient descent-based methods in accuracy, efficiency, and robustness. Statistical validation using Friedman’s rank test and Wilcoxon post-hoc analysis confirms the superior performance and reliability of these randomized algorithms, establishing them as a computationally efficient and numerically stable alternative to existing approaches.
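To make the approach concrete, below is a minimal Python sketch of how a randomized Gauss–Seidel solver (the standard randomized coordinate-descent variant) can be applied to the ELM output-weight system Hβ = t in place of the Moore–Penrose inverse. The function names, sigmoid activation, column-sampling scheme, and all parameters are illustrative assumptions based on the generic formulation, not the authors' implementation.

```python
import numpy as np

def elm_features(X, W, b):
    """Hidden-layer output matrix H = sigmoid(X @ W + b),
    with W and b drawn at random as in a standard ELM."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def randomized_gauss_seidel(A, y, n_iters=20000, tol=1e-8, seed=None):
    """Randomized Gauss-Seidel (randomized coordinate descent) for
    min_x ||A x - y||^2, solving the system without forming a
    pseudoinverse. Column j is sampled with probability proportional
    to ||A[:, j]||^2; sampling is thus system-dependent, not arbitrary."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    x = np.zeros(n)
    r = y - A @ x                          # residual, kept up to date
    col_sq = np.einsum("ij,ij->j", A, A)   # squared column norms
    probs = col_sq / col_sq.sum()
    for _ in range(n_iters):
        j = rng.choice(n, p=probs)
        step = (A[:, j] @ r) / col_sq[j]   # exact minimization along column j
        x[j] += step
        r -= step * A[:, j]
        if np.linalg.norm(A.T @ r) < tol:  # least-squares stationarity check
            break
    return x

# Toy usage on synthetic data (illustrative only).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))     # inputs
W = rng.standard_normal((5, 40))      # random input weights
bias = rng.standard_normal(40)        # random hidden biases
H = elm_features(X, W, bias)          # N x L feature matrix
t = rng.standard_normal(200)          # targets (single output)
beta = randomized_gauss_seidel(H, t, seed=1)
```

Because each update touches only a single column of H, the solver never inverts, or even forms, HᵀH, which mirrors the inversion-free, hyperparameter-free operation the abstract describes.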
About the journal:
Physica A: Statistical Mechanics and its Applications
Recognized by the European Physical Society
Physica A publishes research in the field of statistical mechanics and its applications.
Statistical mechanics sets out to explain the behaviour of macroscopic systems by studying the statistical properties of their microscopic constituents.
Applications of the techniques of statistical mechanics are widespread and include: applications to physical systems such as solids, liquids and gases; applications to chemical and biological systems (colloids, interfaces, complex fluids, polymers and biopolymers, cell physics); and other interdisciplinary applications to, for instance, biological, economic and sociological systems.