Exploring the Potential of the Kumaraswamy Discrete Half-Logistic Distribution in Data Science Scanning and Decision-Making

JCR: Q1 (Decision Sciences)

Annals of Data Science, vol. 12, no. 3, pp. 1013-1040 | Pub Date: 2024-09-24 | DOI: 10.1007/s40745-024-00558-9

Hend S. Shahen, Mohamed S. Eliwa, Mahmoud El-Morshedy


Citations: 0

Abstract



Data science often uses discrete probability distributions to model and analyze phenomena whose outcomes fall into distinct categories or events. This study presents a discrete probability model, supported on the non-negative integers, that is built from the well-established Kumaraswamy family through a recognized discretization method that preserves the functional form of the survival function. Several statistical properties are derived, including the hazard rate function, raw (crude) moments, index of dispersion, skewness, kurtosis, quantile function, L-moments, and entropies. The new probability mass function can be used to analyze asymmetric dispersion data across different kurtosis forms: mesokurtic, platykurtic, and leptokurtic. The model also handles excess zeros, as well as the under- and overdispersion commonly encountered in diverse fields. The hazard rate function is flexible, taking monotonically decreasing, bathtub, monotonically increasing, and bathtub-constant shapes. After the theoretical development of the model, its parameters are estimated by maximum likelihood, and the performance of this technique is assessed in a simulation study. Finally, three real-world applications to count data demonstrate the usefulness and adaptability of the new discrete distribution.
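The abstract does not state the model's formulas, so the following is only a minimal sketch of how a survival-preserving discretization of a Kumaraswamy half-logistic model could look. It assumes the standard Kumaraswamy-G survival function S(x) = (1 - G(x)^a)^b with a half-logistic baseline G, and defines P(X = k) = S(k) - S(k + 1). The parameter names `a`, `b`, `lam` and all function names are hypothetical and may not match the paper's parameterization.

```python
import math

# Assumption: half-logistic baseline CDF, G(x) = (1 - e^{-lam x}) / (1 + e^{-lam x}).
def half_logistic_cdf(x, lam):
    if x <= 0:
        return 0.0
    e = math.exp(-lam * x)
    return (1.0 - e) / (1.0 + e)

# Assumption: Kumaraswamy-G survival function, S(x) = (1 - G(x)^a)^b, with a, b > 0.
def survival(x, a, b, lam):
    return (1.0 - half_logistic_cdf(x, lam) ** a) ** b

# Survival discretization: P(X = k) = S(k) - S(k + 1) for k = 0, 1, 2, ...
# so that the discrete model keeps P(X >= k) = S(k).
def pmf(k, a, b, lam):
    return survival(k, a, b, lam) - survival(k + 1, a, b, lam)

# Discrete hazard rate h(k) = P(X = k) / P(X >= k).
# Only meaningful while S(k) > 0; far in the tail S(k) underflows to 0.
def hazard(k, a, b, lam):
    return pmf(k, a, b, lam) / survival(k, a, b, lam)

# Index of dispersion (variance / mean), computed by truncating the
# infinite sums at k_max, which should be large enough for the tail to vanish.
def index_of_dispersion(a, b, lam, k_max=2000):
    probs = [pmf(k, a, b, lam) for k in range(k_max)]
    mean = sum(k * p for k, p in enumerate(probs))
    second = sum(k * k * p for k, p in enumerate(probs))
    return (second - mean ** 2) / mean

if __name__ == "__main__":
    a, b, lam = 2.0, 1.5, 0.5  # illustrative values only, not from the paper
    print([round(pmf(k, a, b, lam), 4) for k in range(6)])
    print(index_of_dispersion(a, b, lam))
```

An index of dispersion above 1 indicates overdispersion and below 1 underdispersion, so scanning it over a grid of `a`, `b`, `lam` values is one way to explore the flexibility the abstract describes under these assumptions.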


Source journal

Annals of Data Science (Decision Sciences: Statistics, Probability and Uncertainty)

CiteScore: 6.50

Self-citation rate: 0.00%


About the journal: Annals of Data Science (ADS) publishes cutting-edge research findings, experimental results, and case studies in data science. Although data science is regarded as an interdisciplinary field that uses mathematics, statistics, databases, data mining, high-performance computing, knowledge management, and virtualization to discover knowledge from Big Data, it should have its own scientific content, such as axioms, laws, and rules, which are fundamentally important for experts in different fields who explore their own interests in Big Data. ADS encourages contributors to address such challenging problems on this exchange platform. A current open problem is how to discover knowledge from heterogeneous data in a Big Data environment. ADS is a series of volumes edited by either the editorial office or guest editors. Guest editors are responsible for the call for papers and the review process for high-quality contributions in their volumes.