Generalized Cauchy–Schwarz divergence: Efficient estimation and applications in deep learning
Mingfei Lu, Shujian Yu, Robert Jenssen, Badong Chen
{"title":"广义Cauchy-Schwarz散度:有效估计及其在深度学习中的应用","authors":"Mingfei Lu , Shujian Yu , Robert Jenssen , Badong Chen","doi":"10.1016/j.neucom.2025.130904","DOIUrl":null,"url":null,"abstract":"<div><div>Divergence measures play a fundamental role in machine learning and deep learning; however, efficient methods for handling multiple distributions (i.e., more than two) remain largely underexplored. This challenge is particularly critical in scenarios where managing multiple distributions simultaneously is both necessary and unavoidable, such as clustering, multi-source domain adaptation, and multi-view learning. A common approach to quantifying overall divergence involves computing the mean pairwise distances between distributions. However, this method suffers from two key limitations. First, it is restricted to pairwise comparisons and fails to capture higher-order interactions or dependencies among three or more distributions. Second, its implementation requires a double-loop traversal over all distribution pairs, leading to significant computational overhead, particularly when dealing with a large number of distributions. In this study, we introduce the generalized Cauchy–Schwarz divergence (GCSD), a novel divergence measure specifically designed for multiple distributions. To facilitate its practical application, we propose a kernel-based closed-form sample estimator, which enables efficient computation in various deep-learning contexts. Furthermore, we validate GCSD through two representative tasks: deep clustering, achieved by maximizing the generalized divergence between clusters, and multi-source domain adaptation, achieved by minimizing the generalized discrepancy among feature distributions. Extensive experimental evaluations highlight the robustness and effectiveness of GCSD in both tasks, underscoring its potential to advance machine learning techniques that require the quantification of multiple distributions.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"652 ","pages":"Article 130904"},"PeriodicalIF":5.5000,"publicationDate":"2025-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Generalized Cauchy–Schwarz divergence: Efficient estimation and applications in deep learning\",\"authors\":\"Mingfei Lu , Shujian Yu , Robert Jenssen , Badong Chen\",\"doi\":\"10.1016/j.neucom.2025.130904\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Divergence measures play a fundamental role in machine learning and deep learning; however, efficient methods for handling multiple distributions (i.e., more than two) remain largely underexplored. This challenge is particularly critical in scenarios where managing multiple distributions simultaneously is both necessary and unavoidable, such as clustering, multi-source domain adaptation, and multi-view learning. A common approach to quantifying overall divergence involves computing the mean pairwise distances between distributions. However, this method suffers from two key limitations. First, it is restricted to pairwise comparisons and fails to capture higher-order interactions or dependencies among three or more distributions. Second, its implementation requires a double-loop traversal over all distribution pairs, leading to significant computational overhead, particularly when dealing with a large number of distributions. 
In this study, we introduce the generalized Cauchy–Schwarz divergence (GCSD), a novel divergence measure specifically designed for multiple distributions. To facilitate its practical application, we propose a kernel-based closed-form sample estimator, which enables efficient computation in various deep-learning contexts. Furthermore, we validate GCSD through two representative tasks: deep clustering, achieved by maximizing the generalized divergence between clusters, and multi-source domain adaptation, achieved by minimizing the generalized discrepancy among feature distributions. Extensive experimental evaluations highlight the robustness and effectiveness of GCSD in both tasks, underscoring its potential to advance machine learning techniques that require the quantification of multiple distributions.</div></div>\",\"PeriodicalId\":19268,\"journal\":{\"name\":\"Neurocomputing\",\"volume\":\"652 \",\"pages\":\"Article 130904\"},\"PeriodicalIF\":5.5000,\"publicationDate\":\"2025-07-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Neurocomputing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0925231225015760\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neurocomputing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0925231225015760","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Divergence measures play a fundamental role in machine learning and deep learning; however, efficient methods for handling multiple distributions (i.e., more than two) remain largely underexplored. This challenge is particularly critical in scenarios where managing multiple distributions simultaneously is both necessary and unavoidable, such as clustering, multi-source domain adaptation, and multi-view learning. A common approach to quantifying overall divergence involves computing the mean pairwise distances between distributions. However, this method suffers from two key limitations. First, it is restricted to pairwise comparisons and fails to capture higher-order interactions or dependencies among three or more distributions. Second, its implementation requires a double-loop traversal over all distribution pairs, leading to significant computational overhead, particularly when dealing with a large number of distributions. In this study, we introduce the generalized Cauchy–Schwarz divergence (GCSD), a novel divergence measure specifically designed for multiple distributions. To facilitate its practical application, we propose a kernel-based closed-form sample estimator, which enables efficient computation in various deep-learning contexts. Furthermore, we validate GCSD through two representative tasks: deep clustering, achieved by maximizing the generalized divergence between clusters, and multi-source domain adaptation, achieved by minimizing the generalized discrepancy among feature distributions. Extensive experimental evaluations highlight the robustness and effectiveness of GCSD in both tasks, underscoring its potential to advance machine learning techniques that require the quantification of multiple distributions.
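For reference, the classical two-distribution Cauchy–Schwarz divergence that GCSD generalizes is given below; the specific multi-distribution form and its closed-form estimator are the paper's contributions and are not reproduced here.

```latex
% Cauchy-Schwarz divergence between two densities p and q. It is
% non-negative by the Cauchy-Schwarz inequality and zero iff p = q.
D_{\mathrm{CS}}(p, q) \;=\; -\log
  \frac{\left(\int p(x)\, q(x)\, dx\right)^{2}}
       {\int p(x)^{2}\, dx \,\int q(x)^{2}\, dx}
```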
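To make the baseline's cost concrete, here is a minimal NumPy sketch of the mean-pairwise approach the abstract criticizes, using a standard Parzen-window (Gaussian-kernel) estimate of the two-distribution CS divergence. This is not the authors' implementation; the names gaussian_gram, cs_divergence, and mean_pairwise_cs are illustrative.

```python
# Sketch (not the paper's code): mean pairwise CS divergence over k sample
# sets, showing the double-loop traversal whose cost grows quadratically in k.
import numpy as np
from itertools import combinations

def gaussian_gram(X, Y, sigma=1.0):
    """Gram matrix K[i, j] = exp(-||x_i - y_j||^2 / (2 sigma^2))."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def cs_divergence(X, Y, sigma=1.0):
    """Kernel estimate of D_CS(p, q) = -log[(int pq)^2 / (int p^2 int q^2)]."""
    pq = gaussian_gram(X, Y, sigma).mean()  # estimates int p q
    pp = gaussian_gram(X, X, sigma).mean()  # estimates int p^2
    qq = gaussian_gram(Y, Y, sigma).mean()  # estimates int q^2
    return -np.log(pq ** 2 / (pp * qq))

def mean_pairwise_cs(sample_sets, sigma=1.0):
    """Baseline: average CS divergence over all k(k-1)/2 distribution pairs."""
    pairs = list(combinations(sample_sets, 2))
    return sum(cs_divergence(X, Y, sigma) for X, Y in pairs) / len(pairs)

# Toy usage: three 2-D sample sets with increasingly separated means.
rng = np.random.default_rng(0)
sets = [rng.normal(loc=m, size=(100, 2)) for m in (0.0, 1.0, 2.0)]
print(mean_pairwise_cs(sets))  # larger when the sets are better separated
```

As the abstract notes, this sweep over pairs captures only second-order comparisons; GCSD is intended to replace it with a single joint computation over all k sample sets.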
About the journal:
Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Its essential topics are neurocomputing theory, practice, and applications.