Jiyue Zhao, Tony Yuxiang Pan, Weibo Yao, Hongwei Lu, Zihan Liu
Analysis of classification algorithms: Insights from MNIST and WDBC datasets
Applied and Computational Engineering, published 2024-07-25. DOI: 10.54254/2755-2721/79/20241622 (https://doi.org/10.54254/2755-2721/79/20241622)
Abstract
Classification algorithms have developed considerably over the years in response to the growing complexity of real-world data, and they now provide efficient solutions in domains such as healthcare and data analysis. There is a critical need to identify which algorithms deliver high precision and generalize well. This study assesses diverse models, including Support Vector Machine (SVM), Multi-Layer Perceptron (MLP), Decision Tree (DT), and Random Forest (RF), on the Modified National Institute of Standards and Technology (MNIST) and Wisconsin Diagnostic Breast Cancer (WDBC) datasets, using metrics such as Overall Accuracy (OA), Average Accuracy (AA), and Cohen's kappa. The results show that the performance of each algorithm is mainly determined by the characteristics of the dataset. The study also provides insights into the strengths and limitations of each model.
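The abstract names three metrics without defining them. The sketch below shows one common way to compute them from a confusion matrix, and it is not taken from the paper. It assumes that OA is the fraction of correctly classified samples, that AA is the unweighted mean of per-class recall, and that Cohen's kappa is (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. The function names and the toy labels are hypothetical.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     n_classes: int) -> List[List[int]]:
    """Count matrix: rows index the true class, columns the predicted class."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def overall_accuracy(cm: List[List[int]]) -> float:
    """OA (assumed definition): correctly classified samples / all samples."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def average_accuracy(cm: List[List[int]]) -> float:
    """AA (assumed definition): mean per-class recall.

    Classes with no true samples are skipped.
    """
    recalls = [cm[i][i] / sum(cm[i]) for i in range(len(cm)) if sum(cm[i]) > 0]
    return sum(recalls) / len(recalls)


def cohens_kappa(cm: List[List[int]]) -> float:
    """Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    p_o = overall_accuracy(cm)
    row_totals = [sum(cm[i]) for i in range(n)]
    col_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    # Chance agreement from the marginal class frequencies.
    p_e = sum(row_totals[i] * col_totals[i] for i in range(n)) / (total * total)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1.0 - p_e)


if __name__ == "__main__":
    # Hypothetical toy labels for a two-class problem (not data from the study).
    y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
    y_pred = [0, 0, 0, 1, 1, 1, 1, 1, 1, 0]
    cm = confusion_matrix(y_true, y_pred, n_classes=2)
    print(f"OA    = {overall_accuracy(cm):.4f}")
    print(f"AA    = {average_accuracy(cm):.4f}")
    print(f"kappa = {cohens_kappa(cm):.4f}")
```

On imbalanced data such as a diagnostic dataset, OA can look strong even when the minority class is poorly recognized. AA and kappa are reported alongside it because they weight classes equally or correct for chance agreement.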