Classification and analysis of the MNIST dataset using PCA and SVM algorithms

Mokhaled N. A. Al-Hamadani
Vojnotehnicki Glasnik, 2023. DOI: 10.5937/vojtehg71-42689 (https://doi.org/10.5937/vojtehg71-42689)
Citations: 1

Abstract

Introduction/purpose: Machine learning methods have become indispensable for analyzing large-scale, complex data, with applications ranging from optimizing business operations to advancing scientific research. Large datasets offer potential for insight and innovation, but they also pose significant challenges in areas such as data quality and structure, and therefore require effective management strategies. Machine learning techniques have emerged as essential tools for identifying and mitigating these challenges. The MNIST dataset, a large collection of handwritten digits, is a prominent and widely used example in this field and is frequently employed for classification and analysis tasks, as in the present study.

Methods: This study used the MNIST dataset to investigate several statistical techniques, including Principal Component Analysis (PCA) implemented in Python. In addition, Support Vector Machine (SVM) models were applied to both linear and non-linear classification problems to assess model accuracy.

Results: The results indicate that PCA is effective for dimensionality reduction but may be less effective for visualization. Both the linear and the non-linear SVM models classified the dataset effectively.

Conclusion: The findings demonstrate that SVM can serve as an effective technique for classification problems.
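The pipeline described under Methods (PCA for dimensionality reduction followed by linear and non-linear SVM classifiers) can be sketched in Python as follows. This is a minimal illustration, not the study's code: the abstract does not state the library, the number of components, or the SVM hyperparameters, so the use of scikit-learn, its small bundled 8x8 digits dataset as a stand-in for MNIST, the choice of 30 components, the RBF kernel as the non-linear model, and the helper name `fit_and_score` are all assumptions.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: the bundled 8x8 digits set (1,797 samples, 64 features)
# stands in for MNIST so that the sketch runs without a download.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)


def fit_and_score(kernel, n_components=30):
    """Hypothetical helper: scale, reduce with PCA, classify with an SVM.

    Returns the accuracy on the held-out test split.
    """
    model = make_pipeline(
        StandardScaler(),
        PCA(n_components=n_components, random_state=0),
        SVC(kernel=kernel),
    )
    model.fit(X_train, y_train)
    return model.score(X_test, y_test)


# Linear SVM versus a non-linear (RBF-kernel) SVM on the PCA-reduced data.
scores = {kernel: fit_and_score(kernel) for kernel in ("linear", "rbf")}
for kernel, accuracy in scores.items():
    print(f"{kernel} SVM test accuracy: {accuracy:.3f}")

# Share of total variance kept by a 2-component projection, the kind
# typically used for a scatter-plot visualization.
pca_2d = PCA(n_components=2).fit(StandardScaler().fit_transform(X_train))
variance_2d = pca_2d.explained_variance_ratio_.sum()
print(f"Variance retained by 2 components: {variance_2d:.3f}")
```

The last step gives one way to read the Results statement about visualization: a two-component projection keeps only a fraction of the total variance, so digit classes tend to overlap in a 2-D plot even when a higher-dimensional PCA projection still supports accurate classification.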