Multimedia Analysis and Fusion via Wasserstein Barycenter
Authors: Cong Jin, Junhao Wang, Jin Wei, Lifeng Tan, Shouxun Liu, Wei Zhao, Shan Liu, Xin Lv
Journal: Int. J. Networked Distributed Comput.
Publication date: 2020-02-01
DOI: 10.2991/ijndc.k.200217.001 (https://doi.org/10.2991/ijndc.k.200217.001)
Citations: 2
Abstract
Many multimedia analysis algorithms rely on probability distributions that characterize audio or image features, which are generally high-dimensional. In music analysis methods such as automatic music transcription (AMT) [1] and music classification [2], for example, having an adequate measure of similarity (or, equivalently, of difference) between distributions becomes crucial. Classical distances and divergences between probability densities include the Kullback-Leibler divergence, the Kolmogorov distance, and the Bhattacharyya distance (closely related to the Hellinger distance). Recently, the optimal transport framework and the Wasserstein distance [3], also known as the earth mover’s distance (EMD) [4], have attracted great interest in computer vision [5], machine learning [6] and data fusion. Given two input probability measures, the Wasserstein distance computes the optimal transport map that carries the first measure m onto the second measure n. Optimality is defined with respect to a cost function that measures the expected displacement under that map. Informally, if m and n are viewed as piles of mass, the Wasserstein distance is the minimal total displacement of every particle of mass required to rearrange m into n.
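The "piles of mass" description above can be made concrete in one dimension. The sketch below is illustrative only and is not taken from the paper: it assumes m and n are histograms on a shared, sorted set of bin positions and that the cost is the absolute displacement. In that special case the optimal transport is monotone, so the 1-Wasserstein distance reduces to the area between the two cumulative distributions. The function name `wasserstein_1d` and the sample data are hypothetical.

```python
import numpy as np


def wasserstein_1d(m, n, positions):
    """1-Wasserstein distance between histograms m and n on shared bin positions.

    Assumes `positions` is sorted in increasing order and that both
    histograms have positive total mass (they are normalized here).
    """
    m = np.asarray(m, dtype=float)
    n = np.asarray(n, dtype=float)
    positions = np.asarray(positions, dtype=float)

    # Normalize so that both inputs are probability distributions.
    m = m / m.sum()
    n = n / n.sum()

    # In 1-D with absolute-displacement cost, the minimal transport cost
    # equals the integral of |CDF_m - CDF_n| over the support.
    cdf_gap = np.abs(np.cumsum(m) - np.cumsum(n))[:-1]
    return float(np.sum(cdf_gap * np.diff(positions)))


positions = np.array([0.0, 1.0, 2.0, 3.0])
m = np.array([1.0, 0.0, 0.0, 0.0])  # all mass at position 0
n = np.array([0.0, 0.0, 0.0, 1.0])  # all mass at position 3

# Every particle of mass must travel 3 units to turn m into n.
print(wasserstein_1d(m, n, positions))  # 3.0
```

For high-dimensional features such as those discussed above, this closed form no longer applies and a general optimal transport solver would be needed instead.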