Layer modified residual Unet++ for speech enhancement using Aquila Black widow optimizer algorithm.

IF 1.6

Network (Bristol, England) | Pub Date: 2025-07-27 | DOI: 10.1080/0954898X.2025.2533866

Thangappanpillai Murugan Minipriya, Ramadoss Rajavel

Pages: 1-49 | Publication type: Journal Article | Open access: no | Platform: Semantic Scholar

Citations: 0

Abstract


Speech enhancement must contend with high computational demands, the need for well-developed datasets, and the demand for better-quality speech signals. Deep learning models help handle different noise types; still, the challenges posed by environmental noise call for highly efficient and robust systems. This paper presents a lightweight deep-learning design, combined with a heuristic-inspired model, for generating an enhanced speech signal from noisy speech data. The model aims to remove the different environmental noises affecting the speech signal. The noisy speech data are converted into spectrograms with the Short-Time Fourier Transform (STFT). The noisy spectrogram is then processed by the newly developed speech enhancement model, Layer Modified Residual Unet++ (LMResUnet++). LMResUnet++ is built around an atrous convolution layer, which lets it capture multi-scale information without requiring additional trainable parameters. The design is also made compact through the proposed hybrid optimization algorithm, Aquila Black Widow Optimization (ABWO), which optimizes various hyperparameters of LMResUnet++. The final denoised spectrogram from LMResUnet++ undergoes the inverse STFT to restore the enhanced speech signal. Experiments are conducted to assess the efficacy of the system. The results indicate that LMResUnet++ achieved PESQ values 7.93%, 5.75%, 3.86%, and 1.90% higher than those of DeepUnet, MTCNN, STCNN, and ResUnet++, respectively.
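The abstract's point about atrous (dilated) convolution, that it widens the receptive field without adding trainable parameters, can be illustrated with a small sketch. This is not the authors' LMResUnet++ layer: the function names `dilated_conv1d` and `receptive_field`, the 1-D setting, and the three-tap kernel are assumptions chosen only for illustration, and the code uses plain Python rather than any deep-learning framework.

```python
from typing import List, Sequence


def dilated_conv1d(signal: Sequence[float],
                   kernel: Sequence[float],
                   dilation: int = 1) -> List[float]:
    """'Valid' 1-D atrous convolution (cross-correlation form, as in DL layers).

    Hypothetical helper for illustration; the kernel taps are applied to
    input samples spaced `dilation` positions apart.
    """
    if dilation < 1:
        raise ValueError("dilation must be >= 1")
    # Number of input samples covered by one output value.
    span = (len(kernel) - 1) * dilation + 1
    out: List[float] = []
    for start in range(len(signal) - span + 1):
        acc = 0.0
        for k, weight in enumerate(kernel):
            acc += weight * signal[start + k * dilation]
        out.append(acc)
    return out


def receptive_field(kernel_size: int, dilation: int) -> int:
    """Input samples seen by a single dilated kernel application."""
    return (kernel_size - 1) * dilation + 1


if __name__ == "__main__":
    x = [float(i) for i in range(10)]
    kernel = [1.0, 0.0, -1.0]  # always three weights, whatever the dilation
    for d in (1, 2, 4):
        y = dilated_conv1d(x, kernel, d)
        print(f"dilation={d} weights={len(kernel)} "
              f"receptive_field={receptive_field(len(kernel), d)} outputs={len(y)}")
```

The kernel keeps three weights for every dilation rate, while the span of input it covers grows from 3 to 5 to 9 samples. Running several dilation rates side by side is one common way to gather multi-scale context; whether LMResUnet++ arranges its atrous layer this way is not stated in the abstract.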


Source journal

Network (Bristol, England)

Self-citation rate

0.00%

Publication count