Short communication: The Wasserstein distance as a dissimilarity metric for comparing detrital age spectra and other geological distributions

IF 2.7 Q2 GEOCHEMISTRY & GEOPHYSICS
A. Lipp, P. Vermeesch
Geochronology, published 2023-05-17. DOI: https://doi.org/10.5194/gchron-5-263-2023. Citations: 2

Abstract

Distributional data such as detrital age populations or grain size distributions are common in the geological sciences. As analytical techniques become more sophisticated, increasingly large amounts of distributional data are being gathered. These advances require quantitative and objective methods, such as multidimensional scaling (MDS), to analyse large numbers of samples. Crucial to such methods is choosing a sensible measure of dissimilarity between samples. At present, the Kolmogorov–Smirnov (KS) statistic is the most widely used of these dissimilarity measures. However, the KS statistic has some limitations such as high sensitivity to differences between the modes of two distributions and insensitivity to their tails. Here, we propose the Wasserstein-2 distance (W2) as an additional and alternative metric for use in geochronology. Whereas the KS distance is defined as the maximum vertical distance between two empirical cumulative distribution functions, the W2 distance is a function of the horizontal distances (i.e. age differences) between observations. Using a variety of synthetic and real datasets, we explore scenarios where the W2 may provide greater geological insight than the KS statistic. We find that in cases where absolute time differences are not relevant (e.g. mixing of known, discrete age peaks), the KS statistic can be more intuitive. However, in scenarios where absolute age differences are important (e.g. temporally and/or spatially evolving sources, thermochronology, and overcoming laboratory biases), W2 is preferable. The W2 distance has been added to the R package, IsoplotR, for immediate use in detrital geochronology and other applications. The W2 distance can be generalized to multiple dimensions, which opens opportunities beyond distributional data.
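The contrast between the "vertical" KS distance and the "horizontal" W2 distance can be illustrated with a minimal sketch in Python using NumPy. This is not the IsoplotR implementation; the function names `ks_distance` and `w2_distance` and the toy ages are hypothetical, chosen only for illustration. The sketch assumes each sample is a plain list of ages with equal weight per grain and no analytical uncertainties, and it computes W2 from the empirical quantile functions, which is the standard one-dimensional form.

```python
import numpy as np


def ks_distance(x, y):
    """Maximum vertical gap between the two empirical CDFs."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    grid = np.concatenate([x, y])  # the ECDFs only change at observed values
    cdf_x = np.searchsorted(x, grid, side="right") / x.size
    cdf_y = np.searchsorted(y, grid, side="right") / y.size
    return float(np.max(np.abs(cdf_x - cdf_y)))


def w2_distance(x, y):
    """Wasserstein-2 distance between two 1-D empirical distributions.

    Uses W2^2 = integral over u in (0, 1) of (Qx(u) - Qy(u))^2, where Qx and
    Qy are the empirical quantile functions (step functions), so the
    integral is an exact finite sum even for unequal sample sizes.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    # Probability levels at which either quantile function jumps.
    levels = np.unique(np.concatenate([
        np.arange(1, x.size + 1) / x.size,
        np.arange(1, y.size + 1) / y.size,
    ]))
    widths = np.diff(np.concatenate([[0.0], levels]))
    mids = levels - widths / 2  # one point inside each constant piece
    qx = x[np.minimum((mids * x.size).astype(int), x.size - 1)]
    qy = y[np.minimum((mids * y.size).astype(int), y.size - 1)]
    return float(np.sqrt(np.sum(widths * (qx - qy) ** 2)))


# Hypothetical toy ages (Ma): the same cluster shifted by 50 Ma and by 1000 Ma.
sample = [100.0, 110.0, 120.0]
near = [150.0, 160.0, 170.0]
far = [1100.0, 1110.0, 1120.0]

print(ks_distance(sample, near), ks_distance(sample, far))
print(w2_distance(sample, near), w2_distance(sample, far))
```

In this toy case neither shifted sample overlaps the original, so the KS distance takes its maximum value of 1 for both, while W2 grows with the size of the age shift (about 50 Ma versus about 1000 Ma). This is one way to read the abstract's point that W2 responds to absolute age differences.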
Source journal: Geochronology (Earth and Planetary Sciences: Paleontology). CiteScore: 6.60. Self-citation rate: 0.00%. Articles published: 35. Review time: 19 weeks.