On the comparison of diversity of parts of a distribution
R. Rajaram, N. Ritchey, B. Castellani
Journal of Physics Communications, published 2023-07-20
DOI: 10.1088/2399-6528/ace952 (https://doi.org/10.1088/2399-6528/ace952)
Abstract
The literature on diversity measures, regardless of the metric used (e.g., Gini-Simpson index, Shannon entropy), has a notable gap: little has been done to connect these measures back to the shape of the original distribution, or to use them to compare the diversity of parts of a given distribution and relate it to the diversity of the whole distribution. As such, the precise quantification of the relationship between the probability of each type $p_i$ and the diversity $D$ in non-uniform distributions, both among parts of a distribution and for the whole, remains unresolved. This is particularly true for Hill numbers, despite their usefulness as ‘effective numbers’. This gap is problematic because most real-world systems (e.g., income distributions, economic complexity indices, rankings, ecological systems) have unequal distributions and varying frequencies, and comprise multiple diversity types with unknown frequencies that can change. To address this issue, we connect case-based entropy, an approach to diversity we developed, to the shape of a probability distribution. This allows us to show that the original probability distribution $g_1$, the case-based entropy curve $g_2$, and the $c_{\{1,k\}}$ versus $c_{\{1,k\}} \ln A_{\{1,k\}}$ curve $g_3$, which we call the slope of diversity, are one-to-one (or injective), i.e., a different probability distribution $g_1$ gives a different curve for $g_2$ and $g_3$. Hence, a different permutation of the original probability distribution $g_1$ (one that leads to a different shape) will uniquely determine the graphs $g_2$ and $g_3$. By proving the injective nature of our approach, we establish a unique way to measure the degree of uniformity of parts, as measured by $D_P/c_P$ for a given part $P$ of the original probability distribution, and also show a unique way to compute $D_P/c_P$ for various shapes of the original distribution and (in terms of comparison) for different curves.
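To make the quantities named in the abstract concrete, the following is a minimal Python sketch of the standard Hill number (the 'effective number' of types) for a whole distribution and for one part of it. The abstract does not define $D_P$ or $c_P$, so the sketch assumes that $c_P$ is the total probability mass of the part and that $D_P$ is the Hill number of the part's probabilities renormalized by $c_P$. The function names `hill_number` and `part_diversity` are hypothetical and are not taken from the paper.

```python
import math


def hill_number(probs, q=1.0):
    """Hill number (effective number of types) of order q for a probability vector."""
    p = [x for x in probs if x > 0]  # zero-probability types contribute nothing
    if abs(q - 1.0) < 1e-12:
        # Order 1 is the limit q -> 1: the exponential of Shannon entropy.
        return math.exp(-sum(x * math.log(x) for x in p))
    return sum(x ** q for x in p) ** (1.0 / (1.0 - q))


def part_diversity(probs, indices, q=1.0):
    """Return (c_P, D_P) for the part P selected by `indices`.

    Assumption (not stated in the abstract): c_P is the total probability
    of the part, and D_P is the Hill number of the part's probabilities
    after renormalizing them by c_P.
    """
    part = [probs[i] for i in indices]
    c_p = sum(part)
    d_p = hill_number([x / c_p for x in part], q)
    return c_p, d_p


# Illustrative, made-up distribution over four types.
p = [0.5, 0.25, 0.125, 0.125]
d_whole = hill_number(p)                # diversity of the whole distribution
c_p, d_p = part_diversity(p, [0, 1])    # part made of the two most likely types
print(d_whole, c_p, d_p, d_p / c_p)
```

For a uniform distribution over $n$ types the Hill number equals $n$ at every order, which is why it is read as an effective number of equally likely types. Under the same assumptions, the order-2 Hill number is the reciprocal of $\sum_i p_i^2$, so the Gini-Simpson index is one minus its reciprocal.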