Quantum-assisted speech enhancement via a two-stage hybrid neural network

IF 3.4, Zone 2 (Physics and Astrophysics), Q1 ACOUSTICS

Applied Acoustics, Pub Date: 2025-05-09, DOI: 10.1016/j.apacoust.2025.110792

Jicheng Yan, Ri-gui Zhou, Wenshan Xu, Yaochong Li, Xue Yang, Shizheng Jia

Volume 238, Article 110792. Full text: https://www.sciencedirect.com/science/article/pii/S0003682X25002646

Citations: 0

Abstract

Traditional deep neural networks (DNNs) demand substantial computational resources for single-channel speech enhancement, while quantum computers are constrained by limited qubit resources. To address these challenges, this study introduces HQCSE (Hybrid quantum-classical speech enhancement), a hybrid quantum-classical framework optimized for near-term quantum computers with low qubit usage. HQCSE incorporates innovative optimization strategies and a novel loss function that mitigates the sensitivity of traditional mean squared error (MSE) to outliers, thereby enhancing model robustness. A novel learnable quantum state embedding method, complemented by advanced feature preprocessing, significantly improves the efficiency of mapping classical data to quantum states compared to previous approaches. The framework's two-stage network design employs quantum neural networks in the preprocessing stage, reducing reliance on classical architectures like LSTM, and collaborates with LSTM in the post-processing stage to enhance information complementarity. Experiments on the TIMIT+ and VoiceBank+Demand datasets demonstrate that HQCSE achieves competitive performance with only half the parameters of baseline models and exhibits superior robustness on unseen datasets. Additionally, quantum noise resistance experiments confirm the framework's robustness against quantum circuit noise, highlighting its potential to advance quantum machine learning in speech enhancement applications.
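The abstract does not say which loss function HQCSE uses, only that it is less sensitive to outliers than MSE. As a rough illustration of that general idea, the sketch below compares MSE with the Huber loss, which is a stand-in chosen here and not the paper's loss. The function names, the `delta` threshold, and the toy data are all assumptions made for this example.

```python
def mse_loss(pred, target):
    """Mean squared error: every residual is squared, so large errors dominate."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def huber_loss(pred, target, delta=1.0):
    """Huber loss (illustrative stand-in, not the HQCSE loss).

    Quadratic for residuals up to `delta`, linear beyond it, so a single
    large residual contributes far less than it does under MSE.
    """
    total = 0.0
    for p, t in zip(pred, target):
        r = abs(p - t)
        if r <= delta:
            total += 0.5 * r * r
        else:
            total += delta * (r - 0.5 * delta)
    return total / len(pred)


# Toy data (hypothetical): the last prediction in `outlier` is far off.
target = [0.0, 0.0, 0.0, 0.0]
clean = [0.1, -0.1, 0.1, -0.1]
outlier = [0.1, -0.1, 0.1, 10.0]

print("MSE  :", mse_loss(clean, target), mse_loss(outlier, target))
print("Huber:", huber_loss(clean, target), huber_loss(outlier, target))
```

On this toy data the single outlier inflates the MSE much more than the Huber loss, which is the kind of behavior a robust training objective is meant to provide.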


Source journal

Applied Acoustics (Physics: Acoustics)

CiteScore

7.40

Self-citation rate

11.80%

Articles published

618

Review time

7.5 months

Journal description: Since its launch in 1968, Applied Acoustics has been publishing high-quality research papers providing state-of-the-art coverage of research findings for engineers and scientists involved in applications of acoustics in the widest sense. Applied Acoustics looks not only at recent developments in the understanding of acoustics but also at ways of exploiting that understanding. The Journal aims to encourage the exchange of practical experience through publication and in so doing creates a fund of technological information that can be used for solving related problems. The presentation of information in graphical or tabular form is especially encouraged. If a report of a mathematical development is a necessary part of a paper, it is important to ensure that it is there only as an integral part of a practical solution to a problem and is supported by data. Applied Acoustics publishes Complete Papers, Short Technical Notes, and Review Articles. Manuscripts that address all fields of applications of acoustics, ranging from medicine and NDT to the environment and buildings, are welcome.