Stochastic Gradient Descent for Kernel-Based Maximum Correntropy Criterion

IF 2.1; Zone 3 (Physics and Astrophysics); JCR Q2 (PHYSICS, MULTIDISCIPLINARY)
Entropy Pub Date : 2024-12-17 DOI:10.3390/e26121104
Tiankai Li, Baobin Wang, Chaoquan Peng, Hong Yin
Entropy, vol. 26, no. 12. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11675914/pdf/
Citations: 0

Abstract

Stochastic Gradient Descent for Kernel-Based Maximum Correntropy Criterion.

The maximum correntropy criterion (MCC) has become an important method in the machine learning and signal processing communities since its successful application in various non-Gaussian noise scenarios. The classical least squares (LS) method takes only the second-order moment of the model into account and leads to a convex optimization problem. By comparison, MCC captures higher-order information that plays a crucial role in robust learning, but it usually requires solving a non-convex optimization problem. Theoretical research on convex optimization has made significant progress, while the theoretical understanding of non-convex optimization is still far from mature. Motivated by the popularity of stochastic gradient descent (SGD) for solving non-convex problems, this paper considers SGD applied to the kernel version of MCC, which has been shown to be robust to outliers and non-Gaussian data in nonlinear structure models. Because existing theoretical results for SGD applied to kernel MCC are not well established, we present a rigorous analysis of its convergence behavior and provide explicit convergence rates under some standard conditions. Our work helps fill the gap between the optimization process and convergence during the iterations: the iterates need to converge to the global minimizer, while the obtained estimator cannot ensure global optimality in the learning process.
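
As a rough illustration of the setting described in the abstract, the sketch below runs one pass of online SGD on a correntropy-induced loss with a Gaussian kernel. The abstract does not give the paper's exact algorithm or conditions, so the loss scaling, the polynomially decaying step size `eta0 * t**(-theta)`, and every function and parameter name here (`gaussian_kernel`, `kernel_mcc_sgd`, `predict`, `sigma`, `width`, `eta0`, `theta`) are assumptions made for illustration only.

```python
import numpy as np


def gaussian_kernel(x, z, width=1.0):
    """Gaussian (RBF) kernel K(x, z) between two feature vectors."""
    diff = np.asarray(x, dtype=float) - np.asarray(z, dtype=float)
    return np.exp(-np.dot(diff, diff) / (2.0 * width ** 2))


def kernel_mcc_sgd(X, y, sigma=1.0, width=1.0, eta0=0.5, theta=0.5):
    """One pass of online SGD on a correntropy-induced loss in an RKHS.

    Assumed loss: sigma^2 * (1 - exp(-(y - f(x))^2 / (2 * sigma^2))).
    Its functional gradient gives the update
        f <- f + eta_t * exp(-e^2 / (2 * sigma^2)) * e * K(x_t, .),
    where e = y_t - f(x_t) and eta_t = eta0 * t^(-theta) (assumed schedule).
    Returns one expansion coefficient per training sample.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]
    coeffs = np.zeros(n)
    for t in range(n):
        # Evaluate the current estimate f_t at the new input x_t.
        pred = sum(coeffs[i] * gaussian_kernel(X[i], X[t], width) for i in range(t))
        err = y[t] - pred
        # Correntropy weight: near 1 for small errors, near 0 for outliers.
        weight = np.exp(-err ** 2 / (2.0 * sigma ** 2))
        eta = eta0 * (t + 1) ** (-theta)
        coeffs[t] = eta * weight * err
    return coeffs


def predict(X_train, coeffs, x, width=1.0):
    """Evaluate f(x) = sum_i coeffs[i] * K(X_train[i], x)."""
    return sum(c * gaussian_kernel(xi, x, width) for c, xi in zip(coeffs, X_train))
```

The exponential weight is what distinguishes this update from kernel least squares SGD: a sample with a very large residual contributes almost nothing to the estimate, which is the robustness property the abstract refers to. The same weight also makes the objective non-convex in general.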

Source journal
Entropy (PHYSICS, MULTIDISCIPLINARY)
CiteScore: 4.90
Self-citation rate: 11.10%
Articles published: 1580
Review time: 21.05 days
Journal description: Entropy (ISSN 1099-4300), an international and interdisciplinary journal of entropy and information studies, publishes reviews, regular research papers and short notes. The journal aims to encourage scientists to publish their theoretical and experimental details as fully as possible. There is no restriction on the length of papers. Where computations or experiments are involved, the details must be provided so that the results can be reproduced.