Transfer Learning With Reconstruction Loss

Wei Cui; Wei Yu
{"title":"Transfer Learning With Reconstruction Loss","authors":"Wei Cui;Wei Yu","doi":"10.1109/TMLCN.2024.3384329","DOIUrl":null,"url":null,"abstract":"In most applications of utilizing neural networks for mathematical optimization, a dedicated model is trained for each specific optimization objective. However, in many scenarios, several distinct yet correlated objectives or tasks often need to be optimized on the same set of problem inputs. Instead of independently training a different neural network for each problem separately, it would be more efficient to exploit the correlations between these objectives and to train multiple neural network models with shared model parameters and feature representations. To achieve this, this paper first establishes the concept of common information: the shared knowledge required for solving the correlated tasks, then proposes a novel approach for model training by adding into the model an additional reconstruction stage associated with a new reconstruction loss. This loss is for reconstructing the common information starting from a selected hidden layer in the model. The proposed approach encourages the learned features to be general and transferable, and therefore can be readily used for efficient transfer learning. For numerical simulations, three applications are studied: transfer learning on classifying MNIST handwritten digits, the device-to-device wireless network power allocation, and the multiple-input-single-output network downlink beamforming and localization. Simulation results suggest that the proposed approach is highly efficient in data and model complexity, is resilient to over-fitting, and has competitive performances.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"2 ","pages":"407-423"},"PeriodicalIF":0.0000,"publicationDate":"2024-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10488445","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Machine Learning in Communications and Networking","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10488445/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

In most applications that use neural networks for mathematical optimization, a dedicated model is trained for each specific optimization objective. However, in many scenarios, several distinct yet correlated objectives or tasks need to be optimized on the same set of problem inputs. Instead of independently training a different neural network for each problem, it is more efficient to exploit the correlations between these objectives and to train multiple neural network models with shared model parameters and feature representations. To achieve this, this paper first establishes the concept of common information: the shared knowledge required for solving the correlated tasks. It then proposes a novel approach for model training by adding to the model an additional reconstruction stage associated with a new reconstruction loss, which reconstructs the common information from a selected hidden layer of the model. The proposed approach encourages the learned features to be general and transferable, and therefore readily usable for efficient transfer learning. For numerical simulations, three applications are studied: transfer learning on MNIST handwritten digit classification, device-to-device wireless network power allocation, and multiple-input-single-output network downlink beamforming and localization. Simulation results suggest that the proposed approach is highly efficient in data and model complexity, is resilient to over-fitting, and achieves competitive performance.
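The abstract describes the mechanism only at a high level; the short PyTorch sketch below illustrates one plausible way to attach an auxiliary reconstruction head to a shared encoder and train on a task loss plus a weighted reconstruction loss. Everything in the sketch is an assumption for illustration rather than the paper's actual architecture: the common information is taken to be the input vector itself, the selected hidden layer is the encoder output, and the layer sizes, toy regression task, and loss weight are arbitrary.

```python
# Minimal sketch of training with an auxiliary reconstruction loss.
# Assumptions (not from the paper): the "common information" is the input
# vector itself, the selected hidden layer is the encoder output, and the
# dimensions, task, and recon_weight are illustrative only.
import torch
import torch.nn as nn

class ReconstructionTransferNet(nn.Module):
    def __init__(self, in_dim=32, hidden_dim=64, out_dim=4):
        super().__init__()
        # Shared feature extractor; its output plays the role of the
        # "selected hidden layer" from which reconstruction starts.
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        # Task-specific head (replaced or fine-tuned when transferring).
        self.task_head = nn.Linear(hidden_dim, out_dim)
        # Reconstruction head: maps hidden features back to the assumed
        # common information (here, the input) to keep features general.
        self.recon_head = nn.Linear(hidden_dim, in_dim)

    def forward(self, x):
        h = self.encoder(x)
        return self.task_head(h), self.recon_head(h)

def train_step(model, optimizer, x, y, recon_weight=0.1):
    """One optimization step on task loss plus weighted reconstruction loss."""
    task_out, recon = model(x)
    task_loss = nn.functional.mse_loss(task_out, y)
    recon_loss = nn.functional.mse_loss(recon, x)  # reconstruct common info
    loss = task_loss + recon_weight * recon_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return task_loss.item(), recon_loss.item()

if __name__ == "__main__":
    model = ReconstructionTransferNet()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    x = torch.randn(128, 32)   # toy inputs
    y = torch.randn(128, 4)    # toy task targets
    for step in range(5):
        t_loss, r_loss = train_step(model, opt, x, y)
        print(f"step {step}: task={t_loss:.4f} recon={r_loss:.4f}")
```

Under this reading, transfer to a new correlated task would keep the trained encoder and attach a fresh task head, with the reconstruction loss having encouraged the shared features to retain the common information rather than overfit to the original objective.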