Distribution-on-distribution regression with Wasserstein metric: Multivariate Gaussian case

Impact factor 1.4; Zone 3 (Mathematics); JCR Q2 (Statistics & Probability)
Ryo Okano, Masaaki Imaizumi
Journal of Multivariate Analysis, Volume 203, Article 105334. Published 2024-05-22.
DOI: 10.1016/j.jmva.2024.105334
URL: https://www.sciencedirect.com/science/article/pii/S0047259X24000411
Citations: 0

Abstract

Distribution data refer to a data set in which each sample is represented as a probability distribution, a subject area that has received increasing interest in the field of statistics. Although several studies have developed distribution-to-distribution regression models for univariate variables, the multivariate scenario remains under-explored due to technical complexities. In this study, we introduce models for regression from one Gaussian distribution to another, using the Wasserstein metric. These models are constructed using the geometry of the Wasserstein space, which enables the transformation of Gaussian distributions into components of a linear matrix space. Owing to their linear regression frameworks, our models are intuitively understandable, and their implementation is simplified because of the optimal transport problem’s analytical solution between Gaussian distributions. We also explore a generalization of our models to encompass non-Gaussian scenarios. We establish the convergence rates of in-sample prediction errors for the empirical risk minimizations in our models. In comparative simulation experiments, our models demonstrate superior performance over a simpler alternative method that transforms Gaussian distributions into matrices. We present an application of our methodology using weather data for illustration purposes.
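
The abstract notes that the optimal transport problem between Gaussian distributions has an analytical solution. As a rough illustration only, the sketch below uses the standard closed form for the 2-Wasserstein distance between N(m1, S1) and N(m2, S2), namely W2^2 = ||m1 - m2||^2 + tr(S1 + S2 - 2 (S1^{1/2} S2 S1^{1/2})^{1/2}), together with the linear part T of the corresponding transport map. The function names (psd_sqrt, gaussian_w2, gaussian_ot_map) are hypothetical and chosen for this example; this is not the authors' implementation, and the paper's regression models may be built differently.

```python
import numpy as np


def psd_sqrt(a):
    """Symmetric square root of a symmetric positive semi-definite matrix."""
    w, v = np.linalg.eigh(a)
    w = np.clip(w, 0.0, None)  # guard against tiny negative eigenvalues
    return (v * np.sqrt(w)) @ v.T


def gaussian_w2(m1, s1, m2, s2):
    """2-Wasserstein distance between N(m1, s1) and N(m2, s2) (closed form)."""
    r1 = psd_sqrt(s1)
    cross = psd_sqrt(r1 @ s2 @ r1)
    sq = np.sum((m1 - m2) ** 2) + np.trace(s1 + s2 - 2.0 * cross)
    return np.sqrt(max(sq, 0.0))  # clamp rounding error below zero


def gaussian_ot_map(s1, s2):
    """Symmetric matrix T with T s1 T = s2; assumes s1 is positive definite.

    The map x -> m2 + T (x - m1) pushes N(m1, s1) forward to N(m2, s2).
    """
    r1 = psd_sqrt(s1)
    r1_inv = np.linalg.inv(r1)
    return r1_inv @ psd_sqrt(r1 @ s2 @ r1) @ r1_inv


# Illustrative inputs (made up for this example, not taken from the paper).
m1, s1 = np.zeros(2), np.eye(2)
m2, s2 = np.array([3.0, 4.0]), np.eye(2)
dist = gaussian_w2(m1, s1, m2, s2)  # equal covariances: reduces to ||m1 - m2||
t_mat = gaussian_ot_map(s1, 4.0 * np.eye(2))
print(dist, t_mat)
```

Because T is a symmetric matrix, the pair (m2 - m1, T - I) lives in a linear space of vectors and matrices. This is one common way a Gaussian can be represented by linear-space components relative to a reference Gaussian; whether it matches the paper's exact construction is an assumption here.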

Source journal
Journal of Multivariate Analysis (Mathematics: Statistics & Probability)
CiteScore: 2.40
Self-citation rate: 25.00%
Articles published: 108
Review time: 74 days
Journal description: Founded in 1971, the Journal of Multivariate Analysis (JMVA) is the central venue for the publication of new, relevant methodology and particularly innovative applications pertaining to the analysis and interpretation of multidimensional data. The journal welcomes contributions to all aspects of multivariate data analysis and modeling, including cluster analysis, discriminant analysis, factor analysis, and multidimensional continuous or discrete distribution theory. Topics of current interest include, but are not limited to, inferential aspects of copula modeling, functional data analysis, graphical modeling, high-dimensional data analysis, image analysis, multivariate extreme-value theory, sparse modeling, and spatial statistics.