Applied Acoustics | Pub Date: 2024-09-09 | DOI: 10.1016/j.apacoust.2024.110267
Dongzhe Zhang, Jianfeng Chen, Siwei Huang, Jisheng Bai, Yafei Jia, Mou Wang
{"title":"Synthesis-to-real robust training for enhanced sound event localization and detection using dynamic kernel convolution networks","authors":"Dongzhe Zhang , Jianfeng Chen , Siwei Huang , Jisheng Bai , Yafei Jia , Mou Wang","doi":"10.1016/j.apacoust.2024.110267","DOIUrl":"10.1016/j.apacoust.2024.110267","url":null,"abstract":"<div><p>Deep learning-based methods have shown high performance in sound event localization and detection (SELD). In real-world spatial sound environments, the presence of reverberation and the uneven distribution of different sound events increase the complexity of the SELD task. In this paper, we propose an effective SELD system for real spatial scenes. First, we introduce a dynamic kernel convolution module with the convolution blocks to adaptively model the channel-wise features with different receptive fields. Second, we integrate two mainstream networks into the proposed SELD system with the multi-track activity-coupled Cartesian direction of arrival (ACCDOA). Moreover, two synthesis-to-real robust training strategies are introduced into the training stage to improve the system's generalization in realistic spatial sound scenes. Finally, we extend the dataset with two data augmentation methods: channel rotation and spatial data synthesis. Four joint metrics are used to evaluate the performance of the SELD system on the Sony-TAu Realistic Spatial Soundscapes dataset. Experimental results show that the proposed systems outperform the fixed-kernel convolution SELD systems. 
In addition, the ensemble system achieves a SELD score of 0.348 in the DCASE SELD task and outperforms the SOTA methods.</p></div>","PeriodicalId":55506,"journal":{"name":"Applied Acoustics","volume":null,"pages":null},"PeriodicalIF":3.4,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142162386","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
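The multi-track ACCDOA output mentioned in the abstract above couples detection and localization in one Cartesian vector per track and class: the vector's length signals activity and its direction gives the direction of arrival. The following is a minimal sketch of how such an output can be decoded. The function name `decode_accdoa`, the array layout `(tracks, classes, 3)`, and the 0.5 activity threshold are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def decode_accdoa(vectors, threshold=0.5):
    """Decode ACCDOA-style vectors of shape (tracks, classes, 3).

    A class is treated as active on a track when the vector norm exceeds
    `threshold` (an assumed value); the vector direction gives the DOA.
    """
    vectors = np.asarray(vectors, dtype=float)
    norms = np.linalg.norm(vectors, axis=-1)
    active = norms > threshold
    x, y, z = vectors[..., 0], vectors[..., 1], vectors[..., 2]
    # Convert the Cartesian direction to azimuth and elevation in degrees.
    azimuth = np.degrees(np.arctan2(y, x))
    elevation = np.degrees(np.arctan2(z, np.hypot(x, y)))
    return active, azimuth, elevation

# One track, two classes: the first is active and points along +y.
example = np.array([[[0.0, 0.9, 0.0], [0.1, 0.0, 0.0]]])
active, azimuth, elevation = decode_accdoa(example)
```

Azimuth and elevation are only meaningful where `active` is true; inactive entries can be masked out before evaluation.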
Applied Acoustics | Pub Date: 2024-09-09 | DOI: 10.1016/j.apacoust.2024.110270
Zhihui Shen, Ming Li, Saiyin Fang, Xu Ning, Feilong Mao, Gezhou Qin, Yue Zhao, Jialong Zhao
{"title":"Wood hole quantity feature extraction and identification based on VMD-SVD of stress wave and mahalanobis distance","authors":"Zhihui Shen , Ming Li , Saiyin Fang , Xu Ning , Feilong Mao , Gezhou Qin , Yue Zhao , Jialong Zhao","doi":"10.1016/j.apacoust.2024.110270","DOIUrl":"10.1016/j.apacoust.2024.110270","url":null,"abstract":"<div><p>To address the problem of wood hole defects, this paper proposes a method for identifying the number of holes based on variational mode decomposition (VMD), singular value decomposition (SVD), and the Mahalanobis distance. First, four holes with diameters of 5 mm, 6 mm and 8 mm were artificially created on the specimen; a wave source was generated on the surface on one side of the holes by pencil-lead break (PLB) tests, and a sensor was placed on the other side. Then 15 groups of signals were randomly selected for each hole case. A 6-layer VMD was applied to decompose each signal into a series of intrinsic mode functions (IMFs), where the number of decomposition layers was determined from the energy conservation index and the orthogonality index. SVD was then performed on the matrix composed of the IMF signals to obtain the corresponding singular value row vectors, which were assembled into the corresponding standardized feature matrix. Finally, for the actual measured signals, the Mahalanobis distances between the eigenvectors and each standardized feature matrix were calculated separately, and the number of holes was determined based on the minimum distance. 
The results show that the calculated standardized feature matrices differ significantly for different numbers of holes, and the identification accuracy based on the Mahalanobis distance is 92 %.</p></div>","PeriodicalId":55506,"journal":{"name":"Applied Acoustics","volume":null,"pages":null},"PeriodicalIF":3.4,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142162401","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
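The pipeline in the abstract above (singular values of the IMF matrix as features, then a minimum Mahalanobis distance decision) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the IMFs are assumed to be already computed by some VMD routine, each class is summarized by the mean and covariance of its reference feature rows, and the names `singular_value_features` and `nearest_class` are hypothetical.

```python
import numpy as np

def singular_value_features(imfs):
    """Return the singular values of an (n_modes, n_samples) IMF matrix."""
    return np.linalg.svd(np.asarray(imfs, dtype=float), compute_uv=False)

def nearest_class(feature, references):
    """Pick the class with the smallest Mahalanobis distance to `feature`.

    `references` maps a label (e.g. number of holes) to an array of shape
    (n_reference_signals, n_features) holding that class's feature rows.
    """
    feature = np.asarray(feature, dtype=float)
    distances = {}
    for label, rows in references.items():
        rows = np.asarray(rows, dtype=float)
        mean = rows.mean(axis=0)
        # Pseudo-inverse guards against a singular sample covariance.
        cov_inv = np.linalg.pinv(np.cov(rows, rowvar=False))
        diff = feature - mean
        distances[label] = float(np.sqrt(max(diff @ cov_inv @ diff, 0.0)))
    return min(distances, key=distances.get), distances

# Synthetic stand-in data: 15 reference rows of 6 features per class.
rng = np.random.default_rng(0)
references = {1: rng.normal(0.0, 1.0, (15, 6)), 2: rng.normal(10.0, 1.0, (15, 6))}
label, distances = nearest_class(np.full(6, 10.0), references)
```

With 15 reference signals and six features per class, the sample covariance is usually invertible; the pseudo-inverse only matters when the reference set is small or the features are collinear.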
Applied Acoustics | Pub Date: 2024-09-06 | DOI: 10.1016/j.apacoust.2024.110269
Prasansha Rastogi, Cornelis H. Venner, Claas Willem Visser, Ysbrand Wijnant
{"title":"Additive manufacturing of functionally graded foams for acoustic insulation and absorption","authors":"Prasansha Rastogi, Cornelis H. Venner, Claas Willem Visser, Ysbrand Wijnant","doi":"10.1016/j.apacoust.2024.110269","DOIUrl":"10.1016/j.apacoust.2024.110269","url":null,"abstract":"<div><p>Acoustic foams and foam-filled metamaterials excel at sound absorption but typically exhibit a low sound transmission loss (STL). Foams that precisely integrate tunable shapes, density gradients, and transitions between open-cell and closed-cell regions have the potential to simultaneously enhance absorption and STL as compared to uniform foams. However, fabrication of these materials is challenging even for small samples that consist of a few thousand unit cells. Here we show additive manufacturing of functionally graded foams via <em>direct bubble writing</em>, a method for generating and stacking bubbles into three-dimensional solid foam constructs with a throughput up to 100 ml/min. The density, pore morphology, flow resistivity, and dynamic mechanical behavior of homogeneous and graded foams are characterized. As a reference case, the STL and absorption of homogeneous samples were tested in an impedance tube for frequencies between 200 Hz and 2600 Hz. Graded samples were subsequently evaluated, revealing strongly enhanced peaks in STL (up to ∼ 68 dB) for closed-cell foams with a low-density core sandwiched between two high-density layers. A high-density core sandwiched between two low-density layers especially broadens the frequency range with high sound absorption and still enhances the STL. 
These results show that functionally graded closed-cell foams are a promising route towards structure-induced dissipation as required for materials that exhibit a high absorption and a high STL.</p></div>","PeriodicalId":55506,"journal":{"name":"Applied Acoustics","volume":null,"pages":null},"PeriodicalIF":3.4,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0003682X24004201/pdfft?md5=702fc13f44f19bb3b2f7e151acd17feb&pid=1-s2.0-S0003682X24004201-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142151975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Applied Acoustics | Pub Date: 2024-09-05 | DOI: 10.1016/j.apacoust.2024.110259
Xiaoxue Luo, Yuxuan Ke, Xiaodong Li, Chengshi Zheng
{"title":"Deep informed spatio-spectral filtering for multi-channel speech extraction against steering vector uncertainties","authors":"Xiaoxue Luo , Yuxuan Ke , Xiaodong Li , Chengshi Zheng","doi":"10.1016/j.apacoust.2024.110259","DOIUrl":"10.1016/j.apacoust.2024.110259","url":null,"abstract":"<div><p>Adaptive beamforming combined with post-filtering is one of the most widely used techniques for suppressing directional interference, ambient noise, and reverberation. However, many adaptive beamforming methods are sensitive to steering vector mismatch, and their performance degrades considerably in practical applications, although researchers have made great efforts to improve robustness. To achieve better performance in challenging scenarios, this paper proposes a two-stage deep informed spatio-spectral filtering method for multi-channel speech extraction, which removes interference, noise, and reverberation simultaneously in the presence of steering vector errors. In the first stage, a direction-informed dual-path beamforming network was introduced to extract the target directional speech with only its early reflections. To improve the robustness, an information rectification block was designed to compensate for the signal model mismatch, and the steering vector uncertainty was taken into account in the training phase. In addition, a dual-path beamforming module was adopted to reduce magnitude distortion and improve phase recovery simultaneously. In the second stage, a magnitude-phase fusion network was proposed, serving as the post-processing module to further fuse the magnitude and phase estimated by the first stage. 
Experimental results confirmed that the proposed method was more robust to the signal model mismatch and achieved better performance than other baseline methods in terms of speech quality and intelligibility.</p></div>","PeriodicalId":55506,"journal":{"name":"Applied Acoustics","volume":null,"pages":null},"PeriodicalIF":3.4,"publicationDate":"2024-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142151972","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Applied Acoustics | Pub Date: 2024-09-03 | DOI: 10.1016/j.apacoust.2024.110260
Lei Zhu, Jinglun Ma, Yue Wu, Fangfang Liu, Jian Kang
{"title":"Effects of acoustic environment on sleep and mental health in residential regions near railways","authors":"Lei Zhu , Jinglun Ma , Yue Wu , Fangfang Liu , Jian Kang","doi":"10.1016/j.apacoust.2024.110260","DOIUrl":"10.1016/j.apacoust.2024.110260","url":null,"abstract":"<div><p>Noise is an important environmental risk factor for physical and mental health. Furthermore, long-term noise exposure is burdensome for the mind and body and has become a serious problem. Rail transportation is one of the main methods used to transport goods in China; however, the noise and vibrations generated by freight trains have serious impacts on residents in nearby regions. To further investigate the relationship between railway noise and road noise exposure and changes in sleep duration and mental health scores, a field study in Harbin, China, was conducted and a health risk prediction model was constructed. The results showed that for every 1 dB increase in the Equivalent continuous A-weighted sound pressure level (LAeq), the percentage of deep sleep among residents living near the railway decreased by 0.2 %. Although residents near the railway reported similar sleep evaluations compared to those living farther away, they exhibited poorer mental health. Cox risk modeling indicates that the risk of mental health problems is approximately three times higher for those living near the railway. 
These findings potentially provide benefits in developing strategies to reduce the risk of mental illness for people residing near railways.</p></div>","PeriodicalId":55506,"journal":{"name":"Applied Acoustics","volume":null,"pages":null},"PeriodicalIF":3.4,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142129292","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Applied Acoustics | Pub Date: 2024-09-02 | DOI: 10.1016/j.apacoust.2024.110265
Yan Tang, Hai Gong, Hanling Mao, Tao Zhang, Yang Li, Junlong Jin
{"title":"Numerical simulation and experimental study on nonlinear ultrasonic characterization of graphite size in nodular cast iron","authors":"Yan Tang , Hai Gong , Hanling Mao , Tao Zhang , Yang Li , Junlong Jin","doi":"10.1016/j.apacoust.2024.110265","DOIUrl":"10.1016/j.apacoust.2024.110265","url":null,"abstract":"<div><p>Graphite morphology has a significant influence on mechanical properties such as tensile strength and hardness in nodular cast iron (NCI). Compared with the detection of graphite nodularity, relatively little ultrasonic simulation and experimental research has addressed graphite size. Therefore, it is of great importance to study rapid nondestructive testing methods for the internal graphite size of NCI. This study establishes a simulation model of nonlinear ultrasonic penetration longitudinal wave detection for graphite size. The acoustic nonlinearity parameter (ANP) is used to characterize the graphite size. Nonlinear ultrasonic detection experiments on the internal graphite size are conducted to verify the reliability of the nonlinear ultrasonic simulation model. The simulation and experimental results show that the ANP of penetrating longitudinal waves decreases as the average graphite diameter increases. The relationship between ANP and internal graphite size is established. Moreover, the experimental results verify the accuracy of the numerical model. The decrease in ANP may be related to the decrease in the number of grain boundaries. Therefore, the nonlinear ultrasonic technique is an effective method for characterizing the internal graphite size. 
The establishment of a nonlinear ultrasonic simulation model for graphite size in NCI lays the research foundation for nonlinear ultrasonic microstructure characterization experiments.</p></div>","PeriodicalId":55506,"journal":{"name":"Applied Acoustics","volume":null,"pages":null},"PeriodicalIF":3.4,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142129291","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Applied Acoustics | Pub Date: 2024-09-02 | DOI: 10.1016/j.apacoust.2024.110258
Jiawei Wang, Hongqing Liu, Shuaiyi Han, Guohua Sun, Xiaoqing Hu
{"title":"Microphone array post-filter based on accurate estimation of noise power spectral density","authors":"Jiawei Wang , Hongqing Liu , Shuaiyi Han , Guohua Sun , Xiaoqing Hu","doi":"10.1016/j.apacoust.2024.110258","DOIUrl":"10.1016/j.apacoust.2024.110258","url":null,"abstract":"<div><p>Conventional post-filtering methods for multi-channel noise reduction rely only on the multi-channel input signals and neglect the noise reduction capability of the microphone array beamformer, which results in overestimation of the noise power spectral density (PSD) and consequently in suboptimal filters in the minimum mean-square-error (MMSE) sense. This paper proposes a novel microphone array post-filter based on accurate PSD estimation: the beamformer output of the microphone array is used to estimate the noise PSD, and a two-step noise reduction method is employed to obtain an accurate post-filter gain function. An error analysis is also given to highlight the advantage of the proposed algorithm over the conventional Zelinski and McCowan post-filters. The performance advantages of the proposed post-filter are demonstrated in terms of segmental SNR (SegSNR), short-time objective intelligibility (STOI), perceptual evaluation of speech quality (PESQ), and deep noise suppression mean opinion score (DNSMOS).</p></div>","PeriodicalId":55506,"journal":{"name":"Applied Acoustics","volume":null,"pages":null},"PeriodicalIF":3.4,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142121948","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Three-dimensional grid-free sound source localization method based on deep learning","authors":"Yunjie Zhao, Yansong He, Hao Chen, Zhifei Zhang, Zhongming Xu","doi":"10.1016/j.apacoust.2024.110261","DOIUrl":"10.1016/j.apacoust.2024.110261","url":null,"abstract":"<div><p>Sound source localization (SSL) technology is a popular method for identifying the locations of noise sources, which serves as a prerequisite for noise control. Deep learning, as a data-driven tool, shows broad perspectives in the field of SSL with its powerful nonlinear fitting ability. The existing deep learning-based SSL methods only provide a two-dimensional (2D) representation of the sound source location and cannot obtain the specific coordinates of the sound source in three-dimensional (3D) space. Although traditional beamforming methods can be directly generalized to 3D scenes in principle, they suffer from the limitations of insufficient vertical resolution and high computational cost. Therefore, a 3D grid-free SSL method (3DGF) informed by deep learning is suggested in this study to enhance the accuracy and computational efficiency of 3D localization. First, the number of data channels is compressed to respect limited memory resources during the training process. Subsequently, a dense convolutional neural network (DenseNet) model is utilized to obtain the 3D spatial coordinates of the sound source using the processed 3D beamforming map as input. Since the coordinates are continuous and are not constrained by the grid of the beamforming map, the grid-free strategy presents more accurate localization results. Then, the effects of the volume of training data and the compression ratio are analyzed, respectively, in simulation, and the localization performance with different signal-to-noise ratios (SNRs) is also tested. Finally, by comparing 3DGF with DAMAS, both simulation and experimental results demonstrate that 3DGF improves the accuracy and efficacy of 3D localization. 
Meanwhile, its satisfactory generalization ability and robustness against noise highlight its potential for practical applications.</p></div>","PeriodicalId":55506,"journal":{"name":"Applied Acoustics","volume":null,"pages":null},"PeriodicalIF":3.4,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142121778","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Applied Acoustics | Pub Date: 2024-09-01 | DOI: 10.1016/j.apacoust.2024.110264
Luming Li, Mingyong Zhou, Lei Huang, Kai Luo, Bingyan Jiang
{"title":"Detachable holographic acoustofluidic chip for striped acoustic field modulation and particle manipulation","authors":"Luming Li , Mingyong Zhou , Lei Huang , Kai Luo , Bingyan Jiang","doi":"10.1016/j.apacoust.2024.110264","DOIUrl":"10.1016/j.apacoust.2024.110264","url":null,"abstract":"<div><p>The development of detachable acoustofluidic devices is of great significance for disposable and cost-effective biological and chemical analysis. In this work, a highly integrated holographic acoustofluidic device based on acoustic holography and microfluidic chips was proposed to realize the modulation of striped acoustic field in microchannels. In the device, the chip is disposable and the transducer is reused. The acoustic hologram was fabricated by injection molding for efficient manufacturing and low cost. In addition, a multiphysics simulation model for holographic acoustofluidic chip was established to analyze the effect of acoustic field modulation and particle manipulation. Results showed that the acoustic pressure inside the microchannel of the device exhibits a clear striped distribution, and a linear arrangement of particles parallel and inclined to the extension direction of the channel wall can be achieved within 2 s. The distance between the arrangement lines in the target region was controlled at around 60 μm. The investigation of thermal effect validates the biocompatibility. 
The designed holographic acoustofluidic device presents a promising option for the manipulation, arrangement, and sorting of cells and other particles in microchannels.</p></div>","PeriodicalId":55506,"journal":{"name":"Applied Acoustics","volume":null,"pages":null},"PeriodicalIF":3.4,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142117633","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Applied Acoustics | Pub Date: 2024-08-31 | DOI: 10.1016/j.apacoust.2024.110253
Zhuo Xue, Dan He, ZeXing Ni, Xiufeng Wang
{"title":"Morphological dictionary learning based sparse classification for small electric motor state recognition under unbalanced samples","authors":"Zhuo Xue , Dan He , ZeXing Ni , Xiufeng Wang","doi":"10.1016/j.apacoust.2024.110253","DOIUrl":"10.1016/j.apacoust.2024.110253","url":null,"abstract":"<div><p>Accurately recognizing sound states in the production line of small electric motors is of great importance for manufacturers to carry out quick repairs and ensure high quality deliveries. In practice, the number of normal samples is much larger than the number of abnormal samples, resulting in unbalanced data that poses huge challenges to traditional detection methods. To overcome these difficulties, this study presents a morphological dictionary learning-based sparse classification (MDL-SC) method combined with audio data augmentation for small electric motor state recognition under unbalanced samples. Firstly, audio data augmentation methods such as adding background noise, pitch shifting, time stretching and combined augmentation are investigated for increasing the number and diversity of samples. Secondly, morphological dictionary learning is proposed for characterizing transient sounds of small electric motors and enhancing the discriminative feature learning capability of the dictionary. Finally, the minimum reconstruction error strategy is used to establish automatic recognition of small electric motor states. Three small motor datasets with unbalanced ratios are established in the experiments to verify the effectiveness of the proposed MDL-SC, which has higher recognition accuracy under unbalanced conditions compared with traditional dictionary learning based sparse classification (DL-SC), k-nearest neighbors, support vector machines and convolutional neural networks. 
This study can provide some theoretical implications for the later development of online detection of small electric motors or other types of electric motors.</p></div>","PeriodicalId":55506,"journal":{"name":"Applied Acoustics","volume":null,"pages":null},"PeriodicalIF":3.4,"publicationDate":"2024-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142095865","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
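The minimum reconstruction error strategy named in the abstract above assigns a sample to the class whose dictionary reconstructs it best. The sketch below shows only that decision rule. As a loudly stated simplification, it codes the signal with ordinary least squares instead of a sparse solver, and the dictionaries are toy matrices rather than learned morphological dictionaries; the function name `classify_by_reconstruction` is hypothetical.

```python
import numpy as np

def classify_by_reconstruction(signal, dictionaries):
    """Assign `signal` to the class whose dictionary reconstructs it best.

    `dictionaries` maps a label to a matrix of shape (n_features, n_atoms).
    Least squares stands in for sparse coding here (an assumption made
    to keep the example short).
    """
    signal = np.asarray(signal, dtype=float)
    errors = {}
    for label, atoms in dictionaries.items():
        coeffs, *_ = np.linalg.lstsq(atoms, signal, rcond=None)
        errors[label] = float(np.linalg.norm(signal - atoms @ coeffs))
    return min(errors, key=errors.get), errors

# Toy dictionaries: each class spans a different pair of coordinates.
dictionaries = {"normal": np.eye(4)[:, :2], "abnormal": np.eye(4)[:, 2:]}
label, errors = classify_by_reconstruction([1.0, 2.0, 0.0, 0.1], dictionaries)
```

Replacing the least-squares step with a sparse coder such as orthogonal matching pursuit leaves the decision rule unchanged, since only the coefficients fed into the residual differ.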