Geometrically Informed Graph Neural Networks for Distributed Speech Enhancement in Challenging Acoustic Environments
Shih-Hau Fang; Po-Han Li; Hau-Hsiang Jung; Syu-Siang Wang
IEEE Sensors Letters, vol. 9, no. 6, pp. 1-4, published 2025-04-28
DOI: 10.1109/LSENS.2025.3564309
https://ieeexplore.ieee.org/document/10978873/
Abstract
Ambient sound, including reverberation and diffuse noise, degrades the clarity and intelligibility of speech signals. Inspired by the effectiveness of human binaural hearing, researchers have explored distributed speech enhancement (DSE) processors for distributed microphone systems. Despite the achievements of previous approaches, challenges remain, especially in reducing noise in speech signals and improving model interpretability. This letter introduces a geometrically informed graph neural network (GIGNN) designed for DSE tasks. The distinct advantage of GIGNN lies in the capability of graph neural networks (GNNs) to visualize structured data, which makes them effective at parameterizing a wide array of spatiotemporal interactions and, in turn, enhances model interpretability. In addition, we assess the effectiveness of geometrically informed spatial matrices within GNNs. Experimental validation at varying signal-to-noise ratios in real-life scenarios underscores the potential of GIGNN.
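The abstract does not specify how the geometrically informed spatial matrices are built. As a purely illustrative sketch, one common way to inject array geometry into a GNN is to derive the adjacency matrix from pairwise microphone distances and use it for message passing. Everything below is an assumption made for illustration and is not taken from the letter: the Gaussian kernel, the `sigma` parameter, the symmetric normalization, the function names, and the microphone positions and features.

```python
import math

def gaussian_adjacency(positions, sigma=1.0):
    """Assumed kernel: A[i][j] = exp(-d_ij^2 / (2 * sigma^2)).

    Closer microphones get larger weights; the diagonal is 1 (self-loops).
    """
    n = len(positions)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = sum((a - b) ** 2 for a, b in zip(positions[i], positions[j]))
            adj[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))
    return adj

def normalize(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2}, as used in GCN-style layers."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def propagate(adj_norm, features):
    """One weight-free message-passing step: H' = A_norm @ H."""
    n = len(adj_norm)
    dim = len(features[0])
    return [[sum(adj_norm[i][k] * features[k][f] for k in range(n))
             for f in range(dim)]
            for i in range(n)]

# Hypothetical 2-D microphone positions in metres (not from the letter).
mic_positions = [(0.0, 0.0), (0.5, 0.0), (3.0, 4.0)]
# Hypothetical per-microphone feature vectors (e.g. two spectral features).
node_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

adj = gaussian_adjacency(mic_positions, sigma=1.0)
adj_norm = normalize(adj)
mixed = propagate(adj_norm, node_features)
```

In this sketch the two nearby microphones exchange information strongly, while the distant third microphone mostly keeps its own features. A learned layer would additionally multiply by a weight matrix and apply a nonlinearity.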