A min–max optimization framework for sparse multi-task deep neural network
Jiacheng Guo, Lei Li, Huiming Sun, Minghai Qin, Hongkai Yu, Tianyun Zhang
Neurocomputing, Volume 650, Article 130865. Published 2025-07-07. DOI: 10.1016/j.neucom.2025.130865
https://www.sciencedirect.com/science/article/pii/S0925231225015371
Citations: 0
Abstract
Multi-task learning is a subfield of machine learning in which a shared model is trained to solve multiple tasks simultaneously. Instead of training multiple models, we only need to train a single model with shared parameters to solve different tasks. By sharing parameters, multi-task learning significantly decreases the number of parameters and reduces computational and storage requirements. However, when applying multi-task learning to deep neural networks, model size remains a challenge, particularly for edge platforms. Compressing multi-task models while maintaining performance across all tasks is another significant challenge. To address these issues, we propose a min–max optimization framework for highly compressed multi-task deep neural network models, combined with weight pruning or dynamic sparse training strategies to improve training efficiency by reducing model parameters. Specifically, weight pruning leverages the reweighted ℓ1 pruning method, enabling high pruning rates while preserving performance across all tasks. Dynamic sparse training, on the other hand, initializes sparse masks and updates them dynamically during training while keeping the number of nonzero weights fixed, which encourages sparsity in the weight matrices and reduces memory footprint and computational requirements. Our proposed min–max optimization framework automatically adjusts the learnable weighting factors between tasks, ensuring optimization for the worst-performing task. Experimental results on the NYUv2 and CIFAR-100 datasets demonstrate that the model incurs only minor performance degradation after pruning with the min–max framework. Further analyses indicate that the min–max framework performs reliably and that its difference from prior methods is statistically significant. The proposed dynamic sparse multi-task framework achieves around a 2% overall precision improvement with min–max optimization compared with prior methods at equal sparsity.
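To make the stated objective concrete, the following is a minimal sketch of a min–max multi-task objective of the kind described above, assuming task losses L_t, shared parameters θ under a sparsity budget k, and learnable task weights λ on the probability simplex Δ; the paper's exact formulation, constraints, and reweighting scheme may differ.

\[
\min_{\theta:\ \|\theta\|_0 \le k} \ \max_{\lambda \in \Delta} \ \sum_{t=1}^{T} \lambda_t \, L_t(\theta),
\qquad
\Delta = \Big\{ \lambda \in \mathbb{R}^{T} : \lambda_t \ge 0,\ \textstyle\sum_{t=1}^{T} \lambda_t = 1 \Big\}
\]

In this sketch, the inner maximization over λ concentrates weight on the task with the largest loss, which matches the abstract's goal of optimizing for the worst-performing task; the ℓ0 constraint stands in for the weight budget enforced by pruning or dynamic sparse training.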
About the journal:
Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice, and applications are the essential topics covered.