A Classification Study in High-Dimensional Data of Linear Discriminant Analysis and Regularized Discriminant Analysis
Authors: Autcha Araveeporn, Somsri Banditvilai
Journal: WSEAS Transactions on Mathematics (JCR Q3, Mathematics)
Published: 2023-05-10
DOI: 10.37394/23206.2023.22.37 (https://doi.org/10.37394/23206.2023.22.37)
Citations: 0
Abstract
This work compares linear discriminant analysis (LDA) and regularized discriminant analysis (RDA) for classification in high-dimensional data. The data consist of a binary (dichotomous) response variable and continuous explanatory variables. Both LDA and RDA are well-known methods in statistical and probabilistic learning classification. LDA produces a linear decision boundary under the assumption that the two classes share a common covariance matrix. RDA extends LDA by regularizing the estimated covariance matrix, which addresses the case where the number of explanatory variables exceeds the number of observations, known as high-dimensional data. The explanatory variables are generated from the normal, contaminated normal, and uniform distributions. The binary response is computed from the logit function of the explanatory variables. The methods are compared by average accuracy percentage, with the highest value identifying the best-performing method in each situation. The simulation results show that LDA was successful with large sample sizes, whereas RDA performed well for most sample sizes.
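The data-generating step described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' simulation code: the coefficient vector `beta`, the 10% contamination rate with standard deviation 5, the uniform range [-1, 1], and the choice to draw the response as a Bernoulli variable with the inverse-logit probability are all hypothetical, since the abstract does not give these details. The function names `draw_feature`, `logistic`, and `simulate` are likewise made up for this example.

```python
import math
import random


def draw_feature(rng, dist):
    """Draw one explanatory value from the named distribution."""
    if dist == "normal":
        return rng.gauss(0.0, 1.0)
    if dist == "contaminated_normal":
        # Assumed mixture: 90% N(0, 1) and 10% N(0, 5^2).
        scale = 5.0 if rng.random() < 0.10 else 1.0
        return rng.gauss(0.0, scale)
    if dist == "uniform":
        # Assumed range; the abstract does not state one.
        return rng.uniform(-1.0, 1.0)
    raise ValueError(f"unknown distribution: {dist}")


def logistic(eta):
    """Inverse logit, written to avoid overflow for large |eta|."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


def simulate(n, p, dist, seed=0):
    """Return (X, y): n rows of p explanatory values and binary responses."""
    rng = random.Random(seed)
    beta = [0.5] * p  # hypothetical coefficients
    X, y = [], []
    for _ in range(n):
        row = [draw_feature(rng, dist) for _ in range(p)]
        eta = sum(b * x for b, x in zip(beta, row))
        # Assumption: response is Bernoulli with probability logistic(eta).
        y.append(1 if rng.random() < logistic(eta) else 0)
        X.append(row)
    return X, y


# High-dimensional case: more explanatory variables (p) than observations (n).
X, y = simulate(n=20, p=50, dist="contaminated_normal")
print(len(X), len(X[0]), sum(y))
```

With `p` larger than `n`, as in the call above, the pooled sample covariance matrix is singular, which is the situation the regularization in RDA is meant to handle.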
About the journal:
WSEAS Transactions on Mathematics publishes original research papers relating to applied and theoretical mathematics. We aim to bring important work to a wide international audience and therefore only publish papers of exceptional scientific value that advance our understanding of these particular areas. The research presented must transcend the limits of case studies, while both experimental and theoretical studies are accepted. It is a multi-disciplinary journal and therefore its content mirrors the diverse interests and approaches of scholars involved with linear algebra, numerical analysis, differential equations, statistics and related areas. We also welcome scholarly contributions from officials with government agencies, international agencies, and non-governmental organizations.