Automated and interpretable m-health discrimination of vocal cord pathology enabled by machine learning
Authors: Nabeel Seedat, V. Aharonson, Y. Hamzany
Published in: 2020 IEEE Asia-Pacific Conference on Computer Science and Data Engineering (CSDE), 2020-12-16
DOI: 10.1109/CSDE50874.2020.9411529
Citations: 6
Abstract
Clinical methods for assessing voice pathologies are typically based on laryngeal endoscopy or audio-perceptual assessment. Both have limited accessibility in low-resource healthcare settings. M-health systems can provide quantitative assessment and improve early detection as part of patient-centered care. Automated methods for voice pathology assessment apply machine learning to acoustic, frequency and noise features extracted from sustained phonation recordings and aim to discriminate pathological voices from controls. In this study, the machine learning methods are applied to discriminating between two prevalent vocal pathologies: vocal cord polyp and vocal cord paralysis. The data were acquired with a low-cost recording device in an experiment at a tertiary medical center, and the pathologies were clinically labeled. Acoustic and spectral features were extracted, and multiple classifiers were compared using batched cross-validation. Tree-based classifiers performed best, with the Extra Trees classifier achieving the highest performance: an accuracy of 0.9565 and an F1-score of 0.9130. Explainable AI (XAI) and feature interpretability analyses were carried out so that clinicians can use the features marked as important in clinical care and planning. The most important features were octave-based spectral contrast and MFCCs 0 to 3. The results indicate the feasibility of using machine learning to accurately discriminate between different types of vocal cord pathologies.
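The two reported metrics, accuracy and F1-score, can diverge when the classes are unevenly represented, which is one reason to report both. The following is a minimal sketch of how each is computed for a two-class problem. The labels below are hypothetical toy data, not the study's recordings, and treating "polyp" as the positive class is an assumption made only for illustration; the abstract does not state which class was positive or how the F1-score was averaged.

```python
def confusion_counts(y_true, y_pred, positive):
    """Count TP, FP, FN, TN, treating `positive` as the positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive):
    """Harmonic mean of precision and recall: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical labels for eight recordings (illustration only).
y_true = ["polyp", "polyp", "polyp",
          "paralysis", "paralysis", "paralysis", "paralysis", "paralysis"]
y_pred = ["polyp", "polyp", "paralysis",
          "paralysis", "paralysis", "paralysis", "paralysis", "polyp"]

print("accuracy:", accuracy(y_true, y_pred))
print("F1 (positive = polyp):", round(f1_score(y_true, y_pred, "polyp"), 4))
```

On this toy data, six of eight predictions are correct, while the F1-score for the smaller class is lower because one polyp is missed and one paralysis case is mislabeled as polyp.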