Divergence estimation for machine learning and signal processing

Masashi Sugiyama
{"title":"Divergence estimation for machine learning and signal processing","authors":"Masashi Sugiyama","doi":"10.1109/IWW-BCI.2013.6506611","DOIUrl":null,"url":null,"abstract":"Approximating a divergence between two probability distributions from their samples is a fundamental challenge in the statistics, information theory, and machine learning communities, because a divergence estimator can be used for various purposes such as two-sample homogeneity testing, change-point detection, and class-balance estimation. Furthermore, an approximator of a divergence between the joint distribution and the product of marginals can be used for independence testing, which has a wide range of applications including feature selection and extraction, clustering, object matching, independent component analysis, and causality learning. In this talk, we review recent advances in direct divergence approximation that follow the general inference principle advocated by Vladimir Vapnik-one should not solve a more general problem as an intermediate step. More specifically, direct divergence approximation avoids separately estimating two probability distributions when approximating a divergence. We cover direct approximators of the Kullback-Leibler (KL) divergence, the Pearson (PE) divergence, the relative PE (rPE) divergence, and the L2-distance. Despite the overwhelming popularity of the KL divergence, we argue that the latter approximators are more useful in practice due to their computational efficiency, high numerical stability, and superior robustness against outliers.","PeriodicalId":129758,"journal":{"name":"2013 International Winter Workshop on Brain-Computer Interface (BCI)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 International Winter Workshop on Brain-Computer Interface (BCI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IWW-BCI.2013.6506611","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

Approximating a divergence between two probability distributions from their samples is a fundamental challenge in the statistics, information theory, and machine learning communities, because a divergence estimator can be used for various purposes such as two-sample homogeneity testing, change-point detection, and class-balance estimation. Furthermore, an approximator of a divergence between the joint distribution and the product of marginals can be used for independence testing, which has a wide range of applications including feature selection and extraction, clustering, object matching, independent component analysis, and causality learning. In this talk, we review recent advances in direct divergence approximation that follow the general inference principle advocated by Vladimir Vapnik: one should not solve a more general problem as an intermediate step. More specifically, direct divergence approximation avoids separately estimating two probability distributions when approximating a divergence. We cover direct approximators of the Kullback-Leibler (KL) divergence, the Pearson (PE) divergence, the relative PE (rPE) divergence, and the L2-distance. Despite the overwhelming popularity of the KL divergence, we argue that the latter approximators are more useful in practice due to their computational efficiency, high numerical stability, and superior robustness against outliers.
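
For reference, the divergences mentioned above have the following standard definitions for densities p and q, writing r(x) = p(x)/q(x) for the density ratio:

KL(p \| q) = \int p(x) \log \frac{p(x)}{q(x)} \, dx,
PE(p \| q) = \frac{1}{2} \int q(x) \left( \frac{p(x)}{q(x)} - 1 \right)^2 dx,
rPE_\alpha(p \| q) = PE\bigl(p \,\big\|\, \alpha p + (1-\alpha) q\bigr), \quad 0 \le \alpha < 1,
L_2(p, q) = \int \bigl( p(x) - q(x) \bigr)^2 dx.

The first three are integrals of a function of the density ratio r(x), which is why a direct estimator can fit a model of r(x) from the two samples and plug it into the divergence, bypassing separate estimation of p and q. As a minimal illustration of this idea (a sketch, not a reproduction of the specific algorithms covered in the talk), the Python snippet below estimates the PE divergence with a least-squares density-ratio fit in the spirit of uLSIF; the Gaussian-kernel basis, the fixed bandwidth sigma, and the regularization parameter lam are illustrative choices that would normally be tuned by cross-validation.

import numpy as np

def gaussian_kernel(X, C, sigma):
    # Pairwise Gaussian kernel values K(x, c) = exp(-||x - c||^2 / (2 sigma^2)).
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def pe_divergence_ulsif(Xp, Xq, sigma=1.0, lam=0.1, n_centers=100, seed=0):
    # Direct Pearson (PE) divergence estimate via a least-squares density-ratio
    # fit (uLSIF-style): model r(x) = p(x)/q(x) as a kernel expansion and solve
    # a ridge-regularized quadratic problem in closed form.
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(Xp), size=min(n_centers, len(Xp)), replace=False)
    centers = Xp[idx]                              # kernel centers taken from p-samples
    Phi_p = gaussian_kernel(Xp, centers, sigma)    # basis values on samples from p
    Phi_q = gaussian_kernel(Xq, centers, sigma)    # basis values on samples from q
    H = Phi_q.T @ Phi_q / len(Xq)                  # empirical second moment under q
    h = Phi_p.mean(axis=0)                         # empirical mean under p
    theta = np.linalg.solve(H + lam * np.eye(len(centers)), h)
    # Plug the fitted ratio model back into the PE objective.
    return h @ theta - 0.5 * theta @ H @ theta - 0.5

For example, with Xp drawn from N(0, 1) and Xq from N(0.5, 1), pe_divergence_ulsif(Xp, Xq) returns an estimate of PE(p||q) computed directly from the two samples; the closed-form linear solve is one reason this family of estimators is computationally efficient and numerically stable.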