Shuaichi Ma, Wenwen Liao, Yi Zhang, Fan Zhang, Yimiao Wang, Zhiyan Lu, Chen Zhao, Jianbo Yu, Peijie He
Research on automatic assessment of the severity of unilateral vocal cord paralysis based on Mel-spectrogram and convolutional neural networks
BioMedical Engineering OnLine, volume 24, issue 1, article 76. Journal Article. Published 2025-06-21.
DOI: https://doi.org/10.1186/s12938-025-01401-9
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12181906/pdf/
Citations: 0
Abstract
Background: This study aims to develop an AI-powered platform that uses Mel-spectrogram analysis and convolutional neural networks (CNNs) to automate the severity assessment of unilateral vocal cord paralysis (UVCP) through voice analysis, providing an objective basis for individualized clinical treatment plans.
Methods: To accurately identify the severity of UVCP, this study developed the CNN model TripleConvNet. Voice samples were collected from 131 healthy individuals and 292 confirmed UVCP patients from the Eye and ENT Hospital of Fudan University. Based on vocal fold compensation function, the patients were divided into three groups: decompensated (84 cases), partially compensated (98 cases), and fully compensated (110 cases). Using Mel-spectrograms and their first- and second-order differential features as inputs, the TripleConvNet model classified patients by severity and was systematically evaluated for its performance in UVCP severity grading tasks.
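The three-part input described in Methods (a Mel-spectrogram plus its first- and second-order differential features) can be illustrated with a minimal sketch. The abstract does not say how the differentials are computed, so this sketch assumes a plain central difference along the time axis with edge frames repeated; the names `delta` and `stack_channels` and the toy values are hypothetical and are not taken from the paper or from TripleConvNet.

```python
from typing import List

# A spectrogram as nested lists: rows are Mel bands, columns are time frames.
Matrix = List[List[float]]


def delta(spec: Matrix) -> Matrix:
    """First-order difference along time (central difference).

    Assumption: edge frames are padded by repeating the first and last
    frame. Each row must contain at least one frame.
    """
    out: Matrix = []
    for row in spec:
        padded = [row[0]] + list(row) + [row[-1]]
        out.append([(padded[t + 2] - padded[t]) / 2.0 for t in range(len(row))])
    return out


def stack_channels(mel: Matrix) -> List[Matrix]:
    """Return [mel, first-order delta, second-order delta] as three channels."""
    d1 = delta(mel)
    d2 = delta(d1)  # second order = delta of the delta
    return [mel, d1, d2]


# Toy Mel-spectrogram with 2 bands and 4 frames (made-up values).
toy_mel = [
    [0.0, 1.0, 4.0, 9.0],
    [2.0, 2.0, 2.0, 2.0],
]
channels = stack_channels(toy_mel)
print(len(channels), len(channels[0]), len(channels[0][0]))
```

Stacking the three matrices as channels gives a CNN an image-like tensor in which the extra channels describe how spectral energy changes over time. Real pipelines usually compute these features with an audio library and a smoothed delta estimate rather than this raw difference.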
Results: TripleConvNet achieved a classification accuracy of 74.3% in distinguishing among healthy voices and the decompensated, partially compensated, and fully compensated UVCP groups.
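To put the reported accuracy in context, the group sizes given in Methods can be turned into class shares and a majority-class baseline. This is only arithmetic on the published counts. It assumes, hypothetically, that the evaluation data had the same class proportions as the full cohort, which the abstract does not state, and it does not recompute the reported 74.3%.

```python
# Group sizes as reported in the abstract.
counts = {
    "healthy": 131,
    "decompensated": 84,
    "partially compensated": 98,
    "fully compensated": 110,
}

total = sum(counts.values())
uvcp_total = total - counts["healthy"]

# Share of each class in the full cohort.
shares = {label: n / total for label, n in counts.items()}

# Accuracy of a trivial classifier that always predicts the largest class
# (assumption: test data mirror the cohort's class proportions).
majority_label = max(counts, key=counts.get)
majority_baseline = counts[majority_label] / total

print(f"total samples: {total}, UVCP patients: {uvcp_total}")
for label, share in shares.items():
    print(f"{label}: {share:.1%}")
print(f"majority-class baseline ({majority_label}): {majority_baseline:.1%}")
```

Under that assumption, always predicting the largest class would be correct for roughly 31% of samples, which is the natural reference point for a four-class accuracy figure.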
Conclusion: This study demonstrates the potential of deep learning-based non-invasive voice analysis for precise grading of UVCP severity. The proposed method offers a promising clinical tool to assist physicians in disease assessment and personalized treatment planning.
About the journal:
BioMedical Engineering OnLine is an open access, peer-reviewed journal that is dedicated to publishing research in all areas of biomedical engineering.
BioMedical Engineering OnLine is aimed at readers and authors throughout the world, with an interest in using tools of the physical and data sciences and techniques in engineering to understand and solve problems in the biological and medical sciences. Topical areas include, but are not limited to:
Bioinformatics
Bioinstrumentation
Biomechanics
Biomedical Devices & Instrumentation
Biomedical Signal Processing
Healthcare Information Systems
Human Dynamics
Neural Engineering
Rehabilitation Engineering
Biomaterials
Biomedical Imaging & Image Processing
BioMEMS and On-Chip Devices
Bio-Micro/Nano Technologies
Biomolecular Engineering
Biosensors
Cardiovascular Systems Engineering
Cellular Engineering
Clinical Engineering
Computational Biology
Drug Delivery Technologies
Modeling Methodologies
Nanomaterials and Nanotechnology in Biomedicine
Respiratory Systems Engineering
Robotics in Medicine
Systems and Synthetic Biology
Systems Biology
Telemedicine/Smartphone Applications in Medicine
Therapeutic Systems, Devices and Technologies
Tissue Engineering