Degree distributions in networks: Beyond the power law
Clement Lee, Emma F. Eastoe, Aiden Farrell
Statistica Neerlandica, published 2024-07-23
DOI: 10.1111/stan.12355 (https://doi.org/10.1111/stan.12355)
Citation count: 0
Abstract
The power law is useful in describing count phenomena such as network degrees and word frequencies. With a single parameter, it captures the main feature that the frequencies are linear on the log‐log scale. Nevertheless, the power law has been criticized on several grounds: that a threshold needs to be preselected without its uncertainty being quantified, that the power law is simply inadequate, and that subsequent hypothesis tests are required to determine whether the data could have come from the power law. We propose a modeling framework that combines two different generalizations of the power law, namely the generalized Pareto distribution and the Zipf‐polylog distribution, to resolve these issues. The proposed mixture distributions are shown to fit the data well and quantify the threshold uncertainty in a natural way. A model selection step embedded in the Bayesian inference algorithm further answers the question of whether the power law is adequate.
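The log-log linearity mentioned in the abstract can be illustrated with a minimal sketch. It assumes a discrete power law p(k) ∝ k^(-alpha) on a truncated support k = 1, ..., k_max; the function names, the truncation, and the value alpha = 2.5 are illustrative assumptions, not taken from the paper, and the sketch does not implement the proposed mixture model.

```python
import math

def power_law_pmf(alpha, k_max):
    """Discrete power law p(k) proportional to k^(-alpha) on k = 1..k_max.

    The truncation at k_max is an assumption made so the normalizing
    constant is a finite sum.
    """
    weights = [k ** (-alpha) for k in range(1, k_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def loglog_slope(pmf, k1, k2):
    """Slope of log p(k) against log k between support points k1 and k2."""
    rise = math.log(pmf[k2 - 1]) - math.log(pmf[k1 - 1])
    run = math.log(k2) - math.log(k1)
    return rise / run

alpha = 2.5  # hypothetical exponent, chosen for illustration only
pmf = power_law_pmf(alpha, k_max=1000)

# The slope is the same wherever it is measured, which is the
# "linear on the log-log scale" property controlled by one parameter.
slopes = [loglog_slope(pmf, k, 10 * k) for k in (1, 5, 20, 100)]
print(slopes)
```

Each slope equals -alpha up to floating-point error, because log p(k) = -alpha * log k + constant. Empirical degree frequencies that bend away from such a line are what motivate generalizations of the power law.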
About the journal:
Statistica Neerlandica has been the journal of the Netherlands Society for Statistics and Operations Research since 1946. It covers all areas of statistics, from theoretical to applied, with a special emphasis on mathematical statistics, statistics for the behavioural sciences and biostatistics. This wide scope is reflected by the expertise of the journal’s editors representing these areas. The diverse editorial board is committed to a fast and fair reviewing process, and will judge submissions on quality, correctness, relevance and originality. Statistica Neerlandica encourages transparency and reproducibility, and offers online resources to make data, code, simulation results and other additional materials publicly available.