Hui Zhang , Shenglong Zhou , Geoffrey Ye Li , Naihua Xiu , Yiju Wang
{"title":"基于阶跃函数的 0/1 深度神经网络递归方法","authors":"Hui Zhang , Shenglong Zhou , Geoffrey Ye Li , Naihua Xiu , Yiju Wang","doi":"10.1016/j.amc.2024.129129","DOIUrl":null,"url":null,"abstract":"<div><div>The deep neural network with step function activation (0/1 DNNs) is a fundamental composite model in deep learning which has high efficiency and robustness to outliers. However, due to the discontinuity and lacking subgradient information of the 0/1 DNNs model, prior researches are largely focused on designing continuous functions to approximate the step activation and developing continuous optimization methods. In this paper, by introducing two sets of network node variables into the 0/1 DNNs and by exploring the composite structure of the resulted model, the 0/1 DNNs is decomposed into a unary optimization model associated with the step function and three derivational optimization subproblems associated with the other variables. For the unary optimization model and two derivational optimization subproblems, we present a closed form solution, and for the third derivational optimization subproblem, we propose an efficient proximal method. Based on this, a globally convergent step function based recursion method for the 0/1 DNNs is developed. The efficiency and performance of the proposed algorithm are validated via theoretical analysis as well as some illustrative numerical examples on classifying MNIST, FashionMNIST and Cifar10 datasets.</div></div>","PeriodicalId":55496,"journal":{"name":"Applied Mathematics and Computation","volume":"488 ","pages":"Article 129129"},"PeriodicalIF":3.5000,"publicationDate":"2024-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A step function based recursion method for 0/1 deep neural networks\",\"authors\":\"Hui Zhang , Shenglong Zhou , Geoffrey Ye Li , Naihua Xiu , Yiju Wang\",\"doi\":\"10.1016/j.amc.2024.129129\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>The deep neural network with step function activation (0/1 DNNs) is a fundamental composite model in deep learning which has high efficiency and robustness to outliers. However, due to the discontinuity and lacking subgradient information of the 0/1 DNNs model, prior researches are largely focused on designing continuous functions to approximate the step activation and developing continuous optimization methods. In this paper, by introducing two sets of network node variables into the 0/1 DNNs and by exploring the composite structure of the resulted model, the 0/1 DNNs is decomposed into a unary optimization model associated with the step function and three derivational optimization subproblems associated with the other variables. For the unary optimization model and two derivational optimization subproblems, we present a closed form solution, and for the third derivational optimization subproblem, we propose an efficient proximal method. Based on this, a globally convergent step function based recursion method for the 0/1 DNNs is developed. The efficiency and performance of the proposed algorithm are validated via theoretical analysis as well as some illustrative numerical examples on classifying MNIST, FashionMNIST and Cifar10 datasets.</div></div>\",\"PeriodicalId\":55496,\"journal\":{\"name\":\"Applied Mathematics and Computation\",\"volume\":\"488 \",\"pages\":\"Article 129129\"},\"PeriodicalIF\":3.5000,\"publicationDate\":\"2024-10-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Applied Mathematics and Computation\",\"FirstCategoryId\":\"100\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0096300324005903\",\"RegionNum\":2,\"RegionCategory\":\"数学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"MATHEMATICS, APPLIED\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Applied Mathematics and Computation","FirstCategoryId":"100","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0096300324005903","RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"MATHEMATICS, APPLIED","Score":null,"Total":0}
A step function based recursion method for 0/1 deep neural networks
The deep neural network with step function activation (0/1 DNNs) is a fundamental composite model in deep learning which has high efficiency and robustness to outliers. However, due to the discontinuity and lacking subgradient information of the 0/1 DNNs model, prior researches are largely focused on designing continuous functions to approximate the step activation and developing continuous optimization methods. In this paper, by introducing two sets of network node variables into the 0/1 DNNs and by exploring the composite structure of the resulted model, the 0/1 DNNs is decomposed into a unary optimization model associated with the step function and three derivational optimization subproblems associated with the other variables. For the unary optimization model and two derivational optimization subproblems, we present a closed form solution, and for the third derivational optimization subproblem, we propose an efficient proximal method. Based on this, a globally convergent step function based recursion method for the 0/1 DNNs is developed. The efficiency and performance of the proposed algorithm are validated via theoretical analysis as well as some illustrative numerical examples on classifying MNIST, FashionMNIST and Cifar10 datasets.
期刊介绍:
Applied Mathematics and Computation addresses work at the interface between applied mathematics, numerical computation, and applications of systems – oriented ideas to the physical, biological, social, and behavioral sciences, and emphasizes papers of a computational nature focusing on new algorithms, their analysis and numerical results.
In addition to presenting research papers, Applied Mathematics and Computation publishes review articles and single–topics issues.