Enhanced early detection of dysarthric speech disabilities using stacking ensemble deep learning model
Jagat Chaitanya Prabhala, Ravi Ragoju, Venkatanareshbabu Kuppili, Christophe Chesneau
Machine Learning with Applications, vol. 21, Article 100721. Published 2025-08-05. DOI: 10.1016/j.mlwa.2025.100721
https://www.sciencedirect.com/science/article/pii/S2666827025001045
Communication disorders, particularly dysarthria, significantly impact individuals by impairing their speech clarity, social interactions, and overall well-being. Early and accurate detection is crucial to enable timely intervention and improve speech therapy outcomes. This study introduces Adaptive Dysarthric Speech Disability Detection using Stacked Ensemble Deep Learning (ADSDD-SEDL), an innovative ensemble-based deep-learning framework for dysarthria detection. The proposed model integrates three deep learning architectures—Multi-Head Attention-based Long Short-Term Memory (MHALSTM), Deep Belief Network (DBN), and Time-Delay Neural Network (TDNN)—within a stacked ensemble model. Unlike conventional stacking methods that use fixed meta-classifiers, this study employs a Genetic Algorithm (GA)-based optimization strategy to dynamically determine optimal weight contributions of the base models, enhancing classification robustness and adaptability.
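The idea of evolving fusion weights rather than fixing a meta-classifier can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the accuracy-based fitness, truncation selection, arithmetic crossover, Gaussian mutation, and all hyperparameter values are assumptions chosen for illustration. `probs` is assumed to hold each base model's predicted probability of the dysarthric class, with shape `(n_models, n_samples)`.

```python
import numpy as np

def ensemble_accuracy(weights, probs, labels):
    """Accuracy of the weighted average of base-model probabilities."""
    fused = np.tensordot(weights, probs, axes=1)  # shape (n_samples,)
    return float(np.mean((fused >= 0.5) == labels))

def ga_optimize_weights(probs, labels, pop_size=30, generations=40,
                        mutation_scale=0.1, seed=0):
    """Evolve non-negative weights that sum to 1 over the base models.

    All settings here are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    n_models = probs.shape[0]
    # Initial population: random points on the probability simplex.
    pop = rng.dirichlet(np.ones(n_models), size=pop_size)
    for _ in range(generations):
        fitness = np.array([ensemble_accuracy(w, probs, labels) for w in pop])
        order = np.argsort(fitness)[::-1]
        parents = pop[order[: pop_size // 2]]  # truncation selection (elitist)
        n_children = pop_size - len(parents)
        # Arithmetic crossover between randomly paired parents.
        idx_a = rng.integers(0, len(parents), size=n_children)
        idx_b = rng.integers(0, len(parents), size=n_children)
        alpha = rng.random((n_children, 1))
        children = alpha * parents[idx_a] + (1.0 - alpha) * parents[idx_b]
        # Gaussian mutation, then renormalise back onto the simplex.
        children += rng.normal(0.0, mutation_scale, size=children.shape)
        children = np.clip(children, 1e-9, None)
        children /= children.sum(axis=1, keepdims=True)
        pop = np.vstack([parents, children])
    fitness = np.array([ensemble_accuracy(w, probs, labels) for w in pop])
    best = int(np.argmax(fitness))
    return pop[best], float(fitness[best])
```

In practice the fitness would be computed on held-out validation predictions from the three trained base models (MHALSTM, DBN, TDNN), so that the evolved weights are not tuned to the training data.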
The preprocessing pipeline converts speech signals from the time domain to the time-frequency domain using a Short-Time Fourier Transform (STFT). Mel-Frequency Cepstral Coefficients (MFCCs) were extracted to capture the key spectral characteristics. Each base model underwent independent training, and the GA optimized the ensemble by evolving an adaptive weight distribution instead of relying on predefined fusion methods. Extensive simulations and hyperparameter tuning confirmed that the GA-optimized ADSDD-SEDL technique significantly improved detection efficiency over traditional ensemble approaches. These findings underscore the advantages of evolutionary optimization in refining speech disorder classification models. This scalable and adaptive model offers a valuable tool for healthcare professionals, enabling precise and automated early diagnosis of dysarthria. Future research could explore alternative evolutionary algorithms, reinforcement learning techniques, and hybrid deep learning approaches to enhance speech disorder classification.
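The STFT-to-MFCC preprocessing step can be sketched in plain NumPy as follows. The abstract does not give the frame length, hop size, FFT size, number of mel filters, or number of coefficients, so the defaults below (25 ms frames and 10 ms hop at 16 kHz, 26 filters, 13 coefficients) are common conventions assumed for illustration. The function names are hypothetical too.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def stft_power(signal, frame_len=400, hop=160, n_fft=512):
    """Power spectrogram from a Hann-windowed STFT; shape (n_frames, n_fft//2 + 1)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(frame_len)
    return np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2

def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters evenly spaced on the mel scale; shape (n_mels, n_fft//2 + 1)."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):      # rising edge of the triangle
            fbank[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge of the triangle
            fbank[m - 1, k] = (right - k) / (right - center)
    return fbank

def dct_ii(x, n_out):
    """Orthonormal DCT-II along the last axis, keeping the first n_out coefficients."""
    n = x.shape[-1]
    k = np.arange(n_out)[:, None]
    basis = np.cos(np.pi * k * (2 * np.arange(n)[None, :] + 1) / (2 * n))
    scale = np.full(n_out, np.sqrt(2.0 / n))
    scale[0] = np.sqrt(1.0 / n)
    return (x @ basis.T) * scale

def mfcc(signal, sample_rate=16000, n_mels=26, n_mfcc=13,
         frame_len=400, hop=160, n_fft=512):
    """MFCCs: STFT power -> mel filterbank energies -> log -> DCT."""
    power = stft_power(signal, frame_len, hop, n_fft)
    mel_energy = power @ mel_filterbank(n_mels, n_fft, sample_rate).T
    return dct_ii(np.log(mel_energy + 1e-10), n_mfcc)
```

Calling `mfcc(signal)` on a one-dimensional float array returns one row of coefficients per frame. That frame-by-coefficient matrix is the kind of input a sequence model such as an LSTM or TDNN would typically consume.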