Mallows' L2 distance in some multivariate methods and its application to histogram-type data
Authors: Katarina Ko, L. Billard
Journal: Advances in Methodology and Statistics
Published: 2012-07-01
DOI: https://doi.org/10.51936/polr7329

Mallows' L2 distance allows total inertia to be decomposed into within and between inertia, in accordance with Huygens' theorem. The distance can also be decomposed into three terms: a location term, a spread term and a shape term; a simple and straightforward proof of this decomposition is presented. These properties help in interpreting the results of distance-based methods such as k-means clustering and classical multidimensional scaling. For histogram-type data, Mallows' L2 distance is preferable because it is simple to calculate, even when the number and length of the histograms' subintervals differ. Its use is illustrated on population pyramids for 14 East European countries in the period 1995–2015. The results give insight into the information that this distance can extract from a complex dataset.
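
The calculation for two histograms with different subintervals can be sketched as follows. This is an illustrative sketch, not the authors' implementation. It assumes that values are uniformly distributed within each subinterval, that every subinterval has positive probability, and that the three terms take the commonly cited form (m1 - m2)^2 + (s1 - s2)^2 + 2*s1*s2*(1 - rho), where m and s are the mean and standard deviation of each histogram and rho is the correlation between the two quantile functions. The abstract names the terms but does not give the formula, so this form is an assumption. The function names `quantile_knots` and `mallows_l2_decomposition` are hypothetical.

```python
import numpy as np


def quantile_knots(edges, probs):
    """Return the knots (cumulative probability, value) of a histogram's
    piecewise-linear quantile function.

    Assumes a uniform density inside each subinterval and strictly
    positive probabilities (illustrative assumptions).
    """
    edges = np.asarray(edges, dtype=float)
    probs = np.asarray(probs, dtype=float)
    probs = probs / probs.sum()              # normalise to total mass 1
    cum = np.concatenate(([0.0], np.cumsum(probs)))
    cum[-1] = 1.0                            # guard against rounding drift
    return cum, edges


def mallows_l2_decomposition(edges1, probs1, edges2, probs2):
    """Squared Mallows' L2 distance between two histograms together with
    assumed location, spread and shape terms."""
    w1, e1 = quantile_knots(edges1, probs1)
    w2, e2 = quantile_knots(edges2, probs2)

    # Merge the breakpoints; both quantile functions are linear between them,
    # so the histograms may have different numbers and lengths of subintervals.
    t = np.union1d(w1, w2)
    q1 = np.interp(t, w1, e1)
    q2 = np.interp(t, w2, e2)
    h = np.diff(t)
    a0, a1 = q1[:-1], q1[1:]
    b0, b1 = q2[:-1], q2[1:]

    # Exact integrals of linear and quadratic pieces on each segment.
    m1 = np.sum(h * (a0 + a1) / 2)
    m2 = np.sum(h * (b0 + b1) / 2)
    s1 = np.sqrt(np.sum(h * (a0**2 + a0 * a1 + a1**2) / 3) - m1**2)
    s2 = np.sqrt(np.sum(h * (b0**2 + b0 * b1 + b1**2) / 3) - m2**2)
    cross = np.sum(h * (2 * a0 * b0 + a0 * b1 + a1 * b0 + 2 * a1 * b1) / 6)
    rho = (cross - m1 * m2) / (s1 * s2)

    d0, d1 = a0 - b0, a1 - b1
    dist2 = np.sum(h * (d0**2 + d0 * d1 + d1**2) / 3)

    location = (m1 - m2) ** 2
    spread = (s1 - s2) ** 2
    shape = 2 * s1 * s2 * (1 - rho)
    return dist2, location, spread, shape


# Made-up histograms with different numbers and lengths of subintervals.
dist2, location, spread, shape = mallows_l2_decomposition(
    [0, 10, 20, 40], [0.2, 0.5, 0.3],
    [5, 15, 25, 30, 50], [0.1, 0.4, 0.3, 0.2],
)
print(dist2, location + spread + shape)
```

Working with quantile functions rather than densities is what makes differing subintervals unproblematic: the two sets of cumulative probabilities are merged, and each piece of the integral has a closed form. The example data are invented for illustration and are not taken from the population pyramids.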