{"title":"Applying feature normalization based on pole filtering to short-utterance speech recognition using deep neural network","authors":"J. Han, M. Kim, H. S. Kim","doi":"10.7776/ASK.2020.39.1.064","DOIUrl":"https://doi.org/10.7776/ASK.2020.39.1.064","url":null,"abstract":"In a conventional speech recognition system using Gaussian Mixture Model-Hidden Markov Model (GMM-HMM), the cepstral feature normalization method based on pole filtering was effective in improving the performance of recognition of short utterances in noisy environments. In this paper, the usefulness of this method for the state-of-the-art speech recognition system using Deep Neural Network (DNN) is examined. Experimental results on AURORA 2 DB show that the cepstral mean and variance normalization based on pole filtering improves the recognition performance of very short utterances compared to that without pole filtering, especially when there is a large mismatch between the training and test conditions.","PeriodicalId":42689,"journal":{"name":"Journal of the Acoustical Society of Korea","volume":null,"pages":null},"PeriodicalIF":0.4,"publicationDate":"2020-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71370562","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Noise evaluation method of DC motor according to change of load","authors":"차수호, Shin Sung-Hwan","doi":"10.7776/ASK.2020.39.2.113","DOIUrl":"https://doi.org/10.7776/ASK.2020.39.2.113","url":null,"abstract":"Motor noise is a major concern in improving the perceived quality of car interior sound because of increased motor usage in passenger cars. The purpose of this study is to propose factors that can represent the acoustic performance of motor noise according to the change of load. To this end, it is first shown that power spectrum and total loudness are not suitable for representing noise performance, and then, , partial loudness related to the brush friction component, and , partial loudness related to the torque ripple component are investigated as factors representing motor noise. The performance curve of motor noise using and is proposed to identify trends of motor noise according to the loads. The curve could be a guide for noise control, the selection of a motor, and the improvement of a system.","PeriodicalId":42689,"journal":{"name":"Journal of the Acoustical Society of Korea","volume":null,"pages":null},"PeriodicalIF":0.4,"publicationDate":"2020-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71370678","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-band multi-scale DenseNet with dilated convolution for background music separation","authors":"Woon-Haeng Heo, Hyemi Kim, O. Kwon","doi":"10.7776/ASK.2019.38.6.697","DOIUrl":"https://doi.org/10.7776/ASK.2019.38.6.697","url":null,"abstract":"We propose a multi-band multi-scale DenseNet with dilated convolution that separates background music signals from broadcast content. Dilated convolution can learn the multi-scale context information represented by the spectrogram. In computer simulation experiments, the proposed architecture is shown to improve Signal to Distortion Ratio (SDR) by 0.15 dB and 0.27 dB in 0 dB and –10 dB Signal to Noise Ratio (SNR) environments, respectively.","PeriodicalId":42689,"journal":{"name":"Journal of the Acoustical Society of Korea","volume":null,"pages":null},"PeriodicalIF":0.4,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42216642","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Combining multi-task autoencoder with Wasserstein generative adversarial networks for improving speech recognition performance","authors":"C. Kao, Hanseok Ko","doi":"10.7776/ASK.2019.38.6.670","DOIUrl":"https://doi.org/10.7776/ASK.2019.38.6.670","url":null,"abstract":"As the presence of background noise in an acoustic signal degrades the performance of speech or acoustic event recognition, it is still challenging to extract noise-robust acoustic features from a noisy signal. In this paper, we propose a combined structure of Wasserstein Generative Adversarial Network (WGAN) and MultiTask AutoEncoder (MTAE) as a deep learning architecture that integrates the strengths of MTAE and WGAN such that it estimates not only noise but also speech features from a noisy acoustic source. The proposed MTAE-WGAN structure is used to estimate the speech signal and the residual noise by employing a gradient penalty and a weight initialization method for Leaky Rectified Linear Unit (LReLU) and Parametric ReLU (PReLU). The proposed MTAE-WGAN structure with the adopted gradient penalty loss function enhances the speech features and subsequently achieves substantial Phoneme Error Rate (PER) improvements over the stand-alone Deep Denoising Autoencoder (DDAE), MTAE, Redundant Convolutional Encoder-Decoder (R-CED) and Recurrent MTAE (RMTAE) models for robust speech recognition.","PeriodicalId":42689,"journal":{"name":"Journal of the Acoustical Society of Korea","volume":null,"pages":null},"PeriodicalIF":0.4,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47244931","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance analysis of underwater acoustic communication based on beam diversity in deep water","authors":"Donghyeon Kim, Heejin Park, Jeongyup Kim, Joung-Soo Park, Jooyoung Hahn","doi":"10.7776/ASK.2019.38.6.678","DOIUrl":"https://doi.org/10.7776/ASK.2019.38.6.678","url":null,"abstract":"Underwater communication performance is degraded by the influence of Inter-Symbol Interference (ISI) due to multipath. Passive time reversal processing is the most effective technique for mitigating multipath, and the diversity combining method can be used to improve its performance. This paper analyzed communication performance using the beam diversity combining method, which combines signals obtained through beam steering to various angles. Directions of arrival were estimated through the beam-time migration, which, in turn, was estimated from probe signals received by a vertical line array. The performance was analyzed based on the number and type of combinations among the estimated angles. In this paper, the data obtained from the Biomimetic Long range Acoustic Communications 2018 (BLAC18) experiment, which was conducted in the East Sea, ~50 km east of Pohang, in October 2018, were used for the analysis. The output Signal to Noise Ratio (SNR) was used as the communication performance indicator.","PeriodicalId":42689,"journal":{"name":"Journal of the Acoustical Society of Korea","volume":null,"pages":null},"PeriodicalIF":0.4,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48614123","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance comparison of lung sound classification using various convolutional neural networks","authors":"Gee Yeun Kim, Hyoung‐Gook Kim","doi":"10.7776/ASK.2019.38.5.568","DOIUrl":"https://doi.org/10.7776/ASK.2019.38.5.568","url":null,"abstract":"In the diagnosis of pulmonary diseases, auscultation technique is simpler than the other methods, and lung sounds can be used for predicting the types of pulmonary diseases as well as identifying patients with pulmonary diseases. Therefore, in this paper, we identify patients with pulmonary diseases and classify lung sounds according to their sound characteristics using various convolutional neural networks, and compare the classification performance of each neural network method. First, lung sounds over affected areas of the chest with pulmonary diseases are collected by using a single-channel lung sound recording device, and spectral features are extracted from the collected sounds in time domain and applied to each neural network. As classification methods, we use general, parallel, and residual convolutional neural network, and compare lung sound classification performance of each neural network through experiments.","PeriodicalId":42689,"journal":{"name":"Journal of the Acoustical Society of Korea","volume":null,"pages":null},"PeriodicalIF":0.4,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41540191","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Image enhancement in ultrasound passive cavitation imaging using centroid and flatness of received channel data","authors":"M. Jeong, S. Kwon, M. Choi","doi":"10.7776/ASK.2019.38.4.450","DOIUrl":"https://doi.org/10.7776/ASK.2019.38.4.450","url":null,"abstract":"Passive cavitation imaging method is used to observe the ultrasonic waves generated when a group of bubbles collapses. A problem with passive cavitation imaging is a low resolution and large side lobe levels. Since ultrasound signals generated by passive cavitation take the form of a pulse, the amplitude distribution of signals received across the receive channels varies depending on the direction of incidence. Both the centroid and flatness were calculated to determine weights at imaging points in order to discriminate between the main and side lobe signals from the signal amplitude distribution of the received channel data and to reduce the side lobe levels. The centroid quantifies how the channel data are distributed across the receive channel, and the flatness measures the variance of the channel data. We applied the centroid weight and the flatness to the passive cavitation image constructed using the delay-and-sum focusing and minimum variance beamforming methods to improve the image quality. Using computer simulation and experiment, we show that the application of weighting in delay-and-sum and minimum variance beamforming reduces side lobe levels.","PeriodicalId":42689,"journal":{"name":"Journal of the Acoustical Society of Korea","volume":null,"pages":null},"PeriodicalIF":0.4,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45684746","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance analysis of automatic target tracking algorithms based on analysis of sea trial data in diver detection sonar","authors":"H. Lee, Sung-Chur Kwon, W. Oh, K. Shin","doi":"10.7776/ASK.2019.38.4.415","DOIUrl":"https://doi.org/10.7776/ASK.2019.38.4.415","url":null,"abstract":"In this paper, we discuss automatic target tracking algorithms for diver detection sonar, which monitors forces infiltrating coastal military installations and major infrastructure. First, we analyzed sea trial data from the diver detection sonar and composed automatic target tracking algorithms that use track existence probability as the track quality measure in a cluttered environment. In particular, we present track management algorithms, which include track initiation, confirmation, termination, and merging, and target tracking algorithms, which include the single-target tracking IPDAF (Integrated Probabilistic Data Association Filter) and the multi-target tracking LMIPDAF (Linear Multi-target Integrated Probabilistic Data Association Filter). We then analyzed the performance of the automatic target tracking algorithms using sea trial data and Monte Carlo simulation data.","PeriodicalId":42689,"journal":{"name":"Journal of the Acoustical Society of Korea","volume":null,"pages":null},"PeriodicalIF":0.4,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48921932","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Range estimation of underwater moving source using frequency-difference-of-arrival of multipath signals","authors":"W. Park, Ki-Man Kim, Y. Son","doi":"10.7776/ASK.2019.38.2.154","DOIUrl":"https://doi.org/10.7776/ASK.2019.38.2.154","url":null,"abstract":"When measuring the radiated noise of an underwater moving source, the range between the acoustic source and the receiver is an important evaluation factor, and measurement standards such as the receiver position, the moving source depth, and the speed are set. Although cross correlation can be used to find the range of the underwater moving source, this method requires a time synchronization process. In this paper, we propose a method that estimates the range by comparing the Doppler frequency difference of the theoretically calculated multipath signal with the Doppler frequency difference of the multipath signal estimated from the received signal. The proposed method does not require a separate time synchronization process. Simulations were performed to verify the performance, and the ranging error of the proposed method was about 95 % lower than that of the conventional method.","PeriodicalId":42689,"journal":{"name":"Journal of the Acoustical Society of Korea","volume":null,"pages":null},"PeriodicalIF":0.4,"publicationDate":"2019-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71370275","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A study on the performance verification of an around-view sonar and an excavation depth measurement sonar application to ROV for track-based heavy works","authors":"Ki-Jun Son, Dong-Jin Park, Min-Jae Kim, Y. Oh, S. Park","doi":"10.7776/ASK.2019.38.2.161","DOIUrl":"https://doi.org/10.7776/ASK.2019.38.2.161","url":null,"abstract":"In this paper, the performance verification of an around-view sonar and an excavation depth measuring sonar applicable to track-based ROVs (Remotely Operated underwater Vehicles) for heavy duty work is studied. For the performance verification, an experiment is carried out in a water tank and at sea by attaching the around-view sonar and the excavation depth measuring sonar for a heavy work ROV. In the case of the around-view sonar, image sonars are mounted on ROV in four directions (front, back, left and right) and in the case of the excavation depth measuring sonar, the same kind of MBES (Multi Beam Echo Sounder) is mounted on the front of the ROV. The result of an operation test of the ROV equipped with these sonars shows that the sonar systems are rarely affected by high turbidity due to sedimentation during the operation. In the case of the around-view sonar, it is possible to see rock formation, gravel and sandbank 30 m ahead of the ROV. It is confirmed that the excavation depth can be measured after the ROV has performed the excavation. This experiment demonstrates that the ROV can improve the efficiency of the work by utilizing the around-view sonar and the excavation depth measuring sonar.","PeriodicalId":42689,"journal":{"name":"Journal of the Acoustical Society of Korea","volume":null,"pages":null},"PeriodicalIF":0.4,"publicationDate":"2019-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71370280","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}