X-distribution: Retraceable Power-Law Exponent of Complex Networks
Authors: Pradumn Kumar Pandey, Aikta Arya, Akrati Saxena
Journal: ACM Transactions on Knowledge Discovery from Data (JCR Q1, Computer Science, Information Systems; impact factor 4.0)
Publication date: 2023-12-30
Publication type: Journal Article
DOI: 10.1145/3639413 (https://doi.org/10.1145/3639413)
Citations: 0
Abstract
X-distribution: Retraceable Power-Law Exponent of Complex Networks
Network modeling has been explored extensively through theoretical analysis and numerical simulation for Network Reconstruction (NR). The network reconstruction problem requires estimating the power-law exponent (γ) of a given input network, so the effectiveness of an NR solution depends on how accurately γ is calculated. In this article, we re-examine the degree distribution-based estimation of γ, which is not very accurate due to approximations. We propose the X-distribution, which is more accurate than the degree distribution. Various state-of-the-art network models, including CPM, NRM, RefOrCite2, BA, CDPAM, and DMS, are considered for simulation purposes, and the simulated results support this claim. Further, we apply the X-distribution to several real-world networks to calculate their power-law exponents, which differ from those calculated using the respective degree distributions. X-distributions are observed to be more linear (closer to a straight line) on the log-log scale than degree distributions. Thus, the X-distribution is more suitable for evaluating the power-law exponent by linear fitting on the log-log scale. The MATLAB implementation of the power-law exponent (γ) calculation using the X-distribution for different network models, and the real-world datasets used in our experiments, are available here: https://github.com/Aikta-Arya/X-distribution-Retraceable-Power-Law-Exponent-of-Complex-Networks.git
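For readers unfamiliar with the baseline the abstract re-examines, the following is a minimal Python sketch of degree-distribution-based estimation of γ by linear fitting on the log-log scale. It assumes the usual power-law form P(k) ∝ k^(-γ), so that log P(k) is linear in log k with slope -γ. It is not the authors' MATLAB implementation and it does not implement the X-distribution, whose definition is not given in the abstract; the function name `estimate_gamma` and the synthetic degree sequence are illustrative assumptions.

```python
import math
from collections import Counter


def estimate_gamma(degrees):
    """Estimate the power-law exponent from a degree sequence.

    Hypothetical helper (not from the paper): fits a least-squares line to
    log P(k) versus log k and returns the negated slope.
    """
    # Empirical degree distribution; degree-0 nodes cannot be log-transformed.
    counts = Counter(d for d in degrees if d > 0)
    if len(counts) < 2:
        raise ValueError("need at least two distinct positive degrees")
    n = sum(counts.values())

    # Points (log k, log P(k)) for every observed degree k.
    xs = [math.log(k) for k in counts]
    ys = [math.log(c / n) for c in counts.values()]

    # Ordinary least-squares slope.
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx

    # P(k) ~ k^(-gamma) implies slope = -gamma.
    return -slope


if __name__ == "__main__":
    # Synthetic sequence (an assumption for illustration): the number of
    # nodes with degree k is 3600 / k^2, i.e. an exact exponent of 2.
    degrees = [k for k in range(1, 7) for _ in range(3600 // (k * k))]
    print(estimate_gamma(degrees))  # approximately 2
```

On real networks the tail of the empirical degree distribution is noisy, which is one reason a plain log-log fit like this can give an inaccurate exponent.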
About the journal:
TKDD welcomes papers on a full range of research in the knowledge discovery and analysis of diverse forms of data. Such subjects include, but are not limited to: scalable and effective algorithms for data mining and big data analysis, mining brain networks, mining data streams, mining multi-media data, mining high-dimensional data, mining text, Web, and semi-structured data, mining spatial and temporal data, data mining for community generation, social network analysis, and graph structured data, security and privacy issues in data mining, visual, interactive and online data mining, pre-processing and post-processing for data mining, robust and scalable statistical methods, data mining languages, foundations of data mining, KDD framework and process, and novel applications and infrastructures exploiting data mining technology including massively parallel processing and cloud computing platforms. TKDD encourages papers that explore the above subjects in the context of large distributed networks of computers, parallel or multiprocessing computers, or new data devices. TKDD also encourages papers that describe emerging data mining applications that cannot be satisfied by the current data mining technology.