Jicheng Yan, Ri-gui Zhou, Wenshan Xu, Yaochong Li, Xue Yang, Shizheng Jia
Applied Acoustics, vol. 238, Article 110792. Journal Article, published 2025-05-09. DOI: 10.1016/j.apacoust.2025.110792. Impact factor 3.4; JCR Q1 (Acoustics). https://www.sciencedirect.com/science/article/pii/S0003682X25002646
Quantum-assisted speech enhancement via a two-stage hybrid neural network
Traditional deep neural networks (DNNs) demand substantial computational resources for single-channel speech enhancement, while quantum computers are constrained by limited qubit resources. To address these challenges, this study introduces HQCSE (Hybrid quantum-classical speech enhancement), a hybrid quantum-classical framework optimized for near-term quantum computers with low qubit usage. HQCSE incorporates innovative optimization strategies and a novel loss function that mitigates the sensitivity of traditional mean squared error (MSE) to outliers, thereby enhancing model robustness. A novel learnable quantum state embedding method, complemented by advanced feature preprocessing, significantly improves the efficiency of mapping classical data to quantum states compared to previous approaches. The framework's two-stage network design employs quantum neural networks in the preprocessing stage, reducing reliance on classical architectures like LSTM, and collaborates with LSTM in the post-processing stage to enhance information complementarity. Experiments on the TIMIT+ and VoiceBank+Demand datasets demonstrate that HQCSE achieves competitive performance with only half the parameters of baseline models and exhibits superior robustness on unseen datasets. Additionally, quantum noise resistance experiments confirm the framework's robustness against quantum circuit noise, highlighting its potential to advance quantum machine learning in speech enhancement applications.
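The abstract does not give the form of the proposed loss function, so the following is only a minimal sketch of the general idea it describes: a loss that is less sensitive to outliers than MSE. The Huber loss is used here as a well-known stand-in and is not claimed to be the loss used in HQCSE; the function names, the `delta` threshold, and the toy values are all illustrative assumptions.

```python
# Illustrative only: Huber loss as a stand-in for "a loss less sensitive
# to outliers than MSE". This is NOT the loss function from the paper.

def mse_loss(pred, target):
    """Mean squared error: every residual is squared, so outliers dominate."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def huber_loss(pred, target, delta=1.0):
    """Huber loss: quadratic for small residuals, linear beyond `delta`.

    `delta` is a hypothetical threshold chosen for this example.
    """
    total = 0.0
    for p, t in zip(pred, target):
        r = abs(p - t)
        if r <= delta:
            total += 0.5 * r * r               # quadratic region
        else:
            total += delta * (r - 0.5 * delta)  # linear region
    return total / len(pred)


# Toy data (assumed): the same predictions with and without one large outlier.
target = [0.0, 0.0, 0.0, 0.0]
clean = [0.1, -0.1, 0.1, -0.1]
with_outlier = [0.1, -0.1, 0.1, 10.0]

if __name__ == "__main__":
    for name, pred in [("clean", clean), ("with outlier", with_outlier)]:
        print(name, "MSE:", mse_loss(pred, target),
              "Huber:", huber_loss(pred, target))
```

With a single large residual, the MSE grows with the square of the error while the Huber value grows only linearly, which is the kind of robustness the abstract attributes to its own loss.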
About the journal:
Since its launch in 1968, Applied Acoustics has been publishing high-quality research papers providing state-of-the-art coverage of research findings for engineers and scientists involved in applications of acoustics in the widest sense.
Applied Acoustics looks not only at recent developments in the understanding of acoustics but also at ways of exploiting that understanding. The Journal aims to encourage the exchange of practical experience through publication and in so doing creates a fund of technological information that can be used for solving related problems. The presentation of information in graphical or tabular form is especially encouraged. If a report of a mathematical development is a necessary part of a paper, it should appear only as an integral part of a practical solution to a problem and be supported by data. Applied Acoustics publishes Complete Papers, Short Technical Notes, and Review Articles.
Manuscripts that address all fields of applications of acoustics ranging from medicine and NDT to the environment and buildings are welcome.