Dávid Sztahó, Attila Zoltán Jenei, I. Valálik, K. Vicsi
Journal: 生物医学工程(英文) (Biomedical Engineering, English). Journal Article, published 2022-01-01. DOI: 10.4236/jbise.2022.151002 (https://doi.org/10.4236/jbise.2022.151002)
The Effect of Speech Fragmentation and Audio Encodings on Automatic Parkinson’s Disease Recognition
Parkinson’s disease is a neurological disease that is incurable according to current clinical knowledge. Early detection and the provision of appropriate treatment are therefore of primary importance. Speech is one of the biomarkers that enable the detection of Parkinson’s disease. Many studies are based on recordings from controlled environments; fewer use recordings made under real-world conditions. In the present study, three objectives were examined: recording fragmentation (paragraph, sentences, time-based), variable encodings (Pulse-Code Modulation [PCM], GSM-Full Rate [FR], G.723.1) and majority voting on 8 kHz recordings using multiple classifiers. Support Vector Machine (SVM), Long Short-Term Memory (LSTM), i-vector and x-vector classifiers were evaluated, with SVM serving as the baseline. The highest results in accuracy and F1-score were achieved using i-vector
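To make the time-based fragmentation and majority-voting ideas concrete, the following is a minimal Python sketch, not the authors' implementation. The function names, the 2-second fragment length, the "PD"/"HC" labels, and the tie-breaking rule are all assumptions introduced for illustration; only the 8 kHz sampling rate comes from the abstract.

```python
from collections import Counter
from typing import List, Sequence


def fragment_by_time(samples: Sequence[float],
                     sample_rate: int = 8000,
                     fragment_seconds: float = 2.0) -> List[Sequence[float]]:
    """Split a recording into consecutive fixed-length fragments.

    A trailing piece shorter than one full fragment is dropped
    (an assumption; the abstract does not say how remainders are handled).
    """
    size = int(sample_rate * fragment_seconds)
    if size <= 0:
        raise ValueError("fragment length must be positive")
    return [samples[i:i + size]
            for i in range(0, len(samples) - size + 1, size)]


def majority_vote(labels: Sequence[str], tie_label: str = "HC") -> str:
    """Return the most frequent fragment-level label.

    If the two most frequent labels are tied, tie_label is returned
    (hypothetical rule, chosen only for this sketch).
    """
    if not labels:
        raise ValueError("no fragment labels to vote on")
    counts = Counter(labels).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return tie_label
    return counts[0][0]


if __name__ == "__main__":
    recording = [0.0] * 20000            # 2.5 s of silence at 8 kHz
    fragments = fragment_by_time(recording, 8000, 1.0)
    print(len(fragments))                # number of full 1-second fragments
    print(majority_vote(["PD", "HC", "PD"]))
```

In use, each fragment would be passed to a fragment-level classifier and the per-fragment decisions combined by `majority_vote` into one decision per recording; the classifier itself is outside the scope of this sketch.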