用詹森-香农发散法提高深度神经网络性能的随机聚焦法确保架构一致性

IF 2.8 4区计算机科学 Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Neural Processing Letters Pub Date : 2024-06-17 DOI:10.1007/s11063-024-11668-z

Wonjik Kim

{"title":"用詹森-香农发散法提高深度神经网络性能的随机聚焦法确保架构一致性","authors":"Wonjik Kim","doi":"10.1007/s11063-024-11668-z","DOIUrl":null,"url":null,"abstract":"<p>Multiple hidden layers in deep neural networks perform non-linear transformations, enabling the extraction of meaningful features and the identification of relationships between input and output data. However, the gap between the training and real-world data can result in network overfitting, prompting the exploration of various preventive methods. The regularization technique called ’dropout’ is widely used for deep learning models to improve the training of robust and generalized features. During the training phase with dropout, neurons in a particular layer are randomly selected to be ignored for each input. This random exclusion of neurons encourages the network to depend on different subsets of neurons at different times, fostering robustness and reducing sensitivity to specific neurons. This study introduces a novel approach called random focusing, departing from complete neuron exclusion in dropout. The proposed random focusing selectively highlights random neurons during training, aiming for a smoother transition between training and inference phases while keeping network architecture consistent. This study also incorporates Jensen–Shannon Divergence to enhance the stability and efficacy of the random focusing method. Experimental validation across tasks like image classification and semantic segmentation demonstrates the adaptability of the proposed methods across different network architectures, including convolutional neural networks and transformers.\n</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"1 1","pages":""},"PeriodicalIF":2.8000,"publicationDate":"2024-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A Random Focusing Method with Jensen–Shannon Divergence for Improving Deep Neural Network Performance Ensuring Architecture Consistency\",\"authors\":\"Wonjik Kim\",\"doi\":\"10.1007/s11063-024-11668-z\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Multiple hidden layers in deep neural networks perform non-linear transformations, enabling the extraction of meaningful features and the identification of relationships between input and output data. However, the gap between the training and real-world data can result in network overfitting, prompting the exploration of various preventive methods. The regularization technique called ’dropout’ is widely used for deep learning models to improve the training of robust and generalized features. During the training phase with dropout, neurons in a particular layer are randomly selected to be ignored for each input. This random exclusion of neurons encourages the network to depend on different subsets of neurons at different times, fostering robustness and reducing sensitivity to specific neurons. This study introduces a novel approach called random focusing, departing from complete neuron exclusion in dropout. The proposed random focusing selectively highlights random neurons during training, aiming for a smoother transition between training and inference phases while keeping network architecture consistent. This study also incorporates Jensen–Shannon Divergence to enhance the stability and efficacy of the random focusing method. Experimental validation across tasks like image classification and semantic segmentation demonstrates the adaptability of the proposed methods across different network architectures, including convolutional neural networks and transformers.\\n</p>\",\"PeriodicalId\":51144,\"journal\":{\"name\":\"Neural Processing Letters\",\"volume\":\"1 1\",\"pages\":\"\"},\"PeriodicalIF\":2.8000,\"publicationDate\":\"2024-06-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Neural Processing Letters\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1007/s11063-024-11668-z\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neural Processing Letters","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s11063-024-11668-z","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

摘要

深度神经网络中的多个隐藏层可执行非线性变换，从而提取有意义的特征并识别输入和输出数据之间的关系。然而，训练数据与真实世界数据之间的差距可能会导致网络过拟合，这促使人们探索各种预防方法。被称为 "dropout "的正则化技术被广泛应用于深度学习模型，以改善鲁棒性和泛化特征的训练。在使用 "剔除 "的训练阶段，特定层中的神经元会被随机选择，以忽略每个输入。这种随机排除神经元的方法鼓励网络在不同时间依赖不同的神经元子集，从而提高鲁棒性，降低对特定神经元的敏感性。本研究引入了一种称为随机聚焦的新方法，它不同于在滤波中完全排除神经元。建议的随机聚焦在训练过程中选择性地突出随机神经元，目的是在训练和推理阶段之间实现更平滑的过渡，同时保持网络架构的一致性。本研究还结合了詹森-香农发散法，以增强随机聚焦方法的稳定性和有效性。在图像分类和语义分割等任务中进行的实验验证表明，所提出的方法可以适应不同的网络架构，包括卷积神经网络和变压器。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

A Random Focusing Method with Jensen–Shannon Divergence for Improving Deep Neural Network Performance Ensuring Architecture Consistency

查看原文本刊更多论文

A Random Focusing Method with Jensen–Shannon Divergence for Improving Deep Neural Network Performance Ensuring Architecture Consistency

Multiple hidden layers in deep neural networks perform non-linear transformations, enabling the extraction of meaningful features and the identification of relationships between input and output data. However, the gap between the training and real-world data can result in network overfitting, prompting the exploration of various preventive methods. The regularization technique called ’dropout’ is widely used for deep learning models to improve the training of robust and generalized features. During the training phase with dropout, neurons in a particular layer are randomly selected to be ignored for each input. This random exclusion of neurons encourages the network to depend on different subsets of neurons at different times, fostering robustness and reducing sensitivity to specific neurons. This study introduces a novel approach called random focusing, departing from complete neuron exclusion in dropout. The proposed random focusing selectively highlights random neurons during training, aiming for a smoother transition between training and inference phases while keeping network architecture consistent. This study also incorporates Jensen–Shannon Divergence to enhance the stability and efficacy of the random focusing method. Experimental validation across tasks like image classification and semantic segmentation demonstrates the adaptability of the proposed methods across different network architectures, including convolutional neural networks and transformers.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Neural Processing Letters 工程技术-计算机：人工智能

CiteScore

4.90

自引率

12.90%

发文量

392

审稿时长

2.8 months

期刊介绍： Neural Processing Letters is an international journal publishing research results and innovative ideas on all aspects of artificial neural networks. Coverage includes theoretical developments, biological models, new formal modes, learning, applications, software and hardware developments, and prospective researches. The journal promotes fast exchange of information in the community of neural network researchers and users. The resurgence of interest in the field of artificial neural networks since the beginning of the 1980s is coupled to tremendous research activity in specialized or multidisciplinary groups. Research, however, is not possible without good communication between people and the exchange of information, especially in a field covering such different areas; fast communication is also a key aspect, and this is the reason for Neural Processing Letters

用詹森-香农发散法提高深度神经网络性能的随机聚焦法 确保架构一致性

摘要

用詹森-香农发散法提高深度神经网络性能的随机聚焦法确保架构一致性