Matthew Dionela, Carey Louise B. Arroyo, Mhica S. Torres, Miguel P. Alaan, Sandy C. Lauguico, R. R. Vicerra, R. Concepcion II
MACHINE LEARNING METHODS FOR EARLY-STAGE DIAGNOSIS OF PARKINSON'S DISEASE THROUGH HANDWRITING DATA
ASEAN Engineering Journal, published 2023-08-30
DOI: 10.11113/aej.v13.18777 (https://doi.org/10.11113/aej.v13.18777)
Abstract
Parkinson's disease (PD) degrades human cognitive and motor functions, causing slowness of movement and postural shakiness. PD is currently incurable, and managing its symptoms in the late stages is difficult. PD diagnosis also has gaps in accuracy due to several clinical challenges. Early-stage detection of PD through its symptoms, such as handwriting abnormality, has therefore become a popular application area for machine learning. Since most related studies focus on advanced algorithms, this study aims to determine the classification accuracies of simpler classical models on the NewHandPD-NewMeander dataset. The study used 9 features extracted from meanders drawn by healthy participants and by participants diagnosed with Parkinson's disease, plus 3 features describing the individual. These 12 features were then reduced to the 8 best by two methods: univariate selection and recursive feature elimination. The machine learning algorithms used for the models are logistic regression, multilayer perceptron, and Naive Bayes. Hyperparameter optimization was also performed. Results showed that feature selection improved the performance of the default models, while optimization had varying effects depending on the feature selection method used. Among the 15 models built, the multilayer perceptron that used the top 8 features from univariate selection with default hyperparameters (MLPU8) performed best. It yielded an accuracy of 84.4% in cross-validation, 87.5% in holdout validation, and an F1-score of 87.5%. The remaining models had accuracies ranging from 81.4% to 84.4% in cross-validation and from 82.5% to 85.0% in holdout validation. Other studies that diagnosed PD from similar handwriting datasets reported lower accuracies of 87.14% and 77.38% despite using more complex algorithms. This suggests that models built on simple architectures can outperform more complex classification methods. The 15 models classify meander data accurately and can be used as an early assessment tool for detecting PD.
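The univariate selection step and the reported metrics can be illustrated with a minimal sketch. The abstract does not say which univariate score was used, so the one-way ANOVA F-score below is an assumption. The data is synthetic (12 random features, 4 of them shifted for one class) rather than NewHandPD-NewMeander, and the function names `anova_f`, `select_k_best`, and `accuracy_and_f1` are hypothetical helpers, not the authors' code.

```python
import random
from statistics import mean


def anova_f(values, labels):
    """One-way ANOVA F-statistic of a single feature across the label groups."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    grand_mean = mean(values)
    ss_between = 0.0  # variation of group means around the grand mean
    ss_within = 0.0   # variation of samples around their own group mean
    for g in groups.values():
        g_mean = mean(g)
        ss_between += len(g) * (g_mean - grand_mean) ** 2
        ss_within += sum((v - g_mean) ** 2 for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    if ss_within == 0:
        return float("inf")
    return (ss_between / df_between) / (ss_within / df_within)


def select_k_best(X, y, k):
    """Return the sorted column indices of the k highest-scoring features."""
    n_features = len(X[0])
    scores = [anova_f([row[j] for row in X], y) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def accuracy_and_f1(y_true, y_pred, positive=1):
    """Accuracy and F1-score for a binary classification result."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Synthetic stand-in data (assumption): 12 features, of which features 0-3
# carry class information; label 1 plays the role of the PD group.
rng = random.Random(0)
N_PER_CLASS, N_FEATURES, INFORMATIVE = 100, 12, (0, 1, 2, 3)
X, y = [], []
for label in (0, 1):
    for _ in range(N_PER_CLASS):
        row = [rng.gauss(0.0, 1.0) for _ in range(N_FEATURES)]
        if label == 1:
            for j in INFORMATIVE:
                row[j] += 2.0
        X.append(row)
        y.append(label)

# Reduce 12 features to the 8 best, mirroring the 12 -> 8 reduction above.
selected = select_k_best(X, y, k=8)
print("selected feature indices:", selected)
```

Ranking each feature independently is what makes the selection univariate. Recursive feature elimination, by contrast, repeatedly refits a model and drops the weakest feature, so the two methods can return different subsets of 8.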