Using Genetic Programming to Identify Probability Distribution behind Data: A Preliminary Trial
Yang Syu, Chien-Min Wang
2022 IEEE International Conference on Data Mining Workshops (ICDMW), 2022-11-01
DOI: 10.1109/ICDMW58026.2022.00056 (https://doi.org/10.1109/ICDMW58026.2022.00056)
Citations: 0
Abstract
Before data are used in downstream applications or more advanced processing, analyzing and identifying their probability distribution is a crucial task. Traditionally, statistical methods have been developed for this purpose. In recent years, researchers in computer science have proposed and applied various machine learning-based techniques to address this problem. However, the existing solutions remain problematic and inconvenient; for example, they require human intervention and produce complex models. Thus, this paper proposes, implements, and tests a genetic programming-based approach for identifying probability functions that avoids these deficiencies and inconveniences. Based on our empirical trials, the proposed approach can effectively recognize (retrieve) the probability distribution function behind data within an immense search space of mathematical expressions.
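To make the idea of searching a space of mathematical expressions concrete, the following is a minimal sketch and not the authors' implementation. It assumes a deliberately simplified, mutation-only evolutionary search (no crossover), a small hypothetical function set (`add`, `sub`, `mul`, `neg`, `exp`), and a fitness defined as the mean squared error between a candidate expression and a histogram estimate of the density. All names, parameters, and the synthetic exponential data are illustrative assumptions.

```python
import math
import random

# Hypothetical function set: name -> (arity, implementation).
# exp clamps its argument so math.exp never raises OverflowError.
FUNCS = {
    "add": (2, lambda a, b: a + b),
    "sub": (2, lambda a, b: a - b),
    "mul": (2, lambda a, b: a * b),
    "neg": (1, lambda a: -a),
    "exp": (1, lambda a: math.exp(max(-50.0, min(50.0, a)))),
}

def random_tree(rng, depth):
    """Grow a random expression tree; leaves are 'x' or a numeric constant."""
    if depth == 0 or rng.random() < 0.3:
        return "x" if rng.random() < 0.5 else round(rng.uniform(-2.0, 2.0), 2)
    name = rng.choice(sorted(FUNCS))
    arity = FUNCS[name][0]
    return (name,) + tuple(random_tree(rng, depth - 1) for _ in range(arity))

def evaluate(tree, x):
    """Evaluate an expression tree at the point x."""
    if tree == "x":
        return x
    if isinstance(tree, tuple):
        return FUNCS[tree[0]][1](*(evaluate(child, x) for child in tree[1:]))
    return tree  # numeric constant

def empirical_density(samples, bins, lo, hi):
    """Histogram estimate of the density as (bin centre, density) pairs."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for s in samples:
        if lo <= s < hi:
            counts[min(bins - 1, int((s - lo) / width))] += 1
    return [(lo + (i + 0.5) * width, c / (len(samples) * width))
            for i, c in enumerate(counts)]

def fitness(tree, points):
    """Mean squared error against the empirical density (lower is better)."""
    total = 0.0
    for x, target in points:
        d = evaluate(tree, x) - target
        total += d * d  # d * d yields inf on overflow instead of raising
    mse = total / len(points)
    return mse if math.isfinite(mse) else float("inf")

def mutate(rng, tree, depth=2):
    """Replace a randomly chosen subtree with a freshly grown one."""
    if not isinstance(tree, tuple) or rng.random() < 0.3:
        return random_tree(rng, depth)
    i = rng.randrange(1, len(tree))
    return tree[:i] + (mutate(rng, tree[i], depth),) + tree[i + 1:]

def evolve(points, rng, pop_size=200, generations=30):
    """Elitist, mutation-only search; returns the best tree and its error history."""
    population = [random_tree(rng, 3) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda t: fitness(t, points))
        history.append(fitness(population[0], points))
        elite = population[: pop_size // 5]  # survivors carried over unchanged
        children = [mutate(rng, rng.choice(elite))
                    for _ in range(pop_size - len(elite))]
        population = elite + children
    best = min(population, key=lambda t: fitness(t, points))
    history.append(fitness(best, points))
    return best, history

# Synthetic data (assumption): samples from an exponential distribution, rate 1.
rng = random.Random(0)
samples = [rng.expovariate(1.0) for _ in range(5000)]
points = empirical_density(samples, bins=20, lo=0.0, hi=4.0)
best, history = evolve(points, rng)
print("best expression:", best)
print("mean squared error:", history[-1])
```

Because the top fifth of each generation is carried over unchanged, the best error never increases from one generation to the next. The true density of the synthetic data, `exp(-x)`, is representable in this function set as `("exp", ("neg", "x"))`, so the search could in principle recover it, although this sketch makes no guarantee that it does within the given budget.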