ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) — Latest Publications

Fire Detection in H.264 Compressed Video
Murat Muhammet Savci, Yasin Yildirim, Gorkem Saygili, B. U. Töreyin
DOI: 10.1109/ICASSP.2019.8683666 | Pages: 8310-8314 | Published: 2019-05-12
Abstract: In this paper, we propose a compressed-domain fire detection algorithm using macroblock types and a Markov model in H.264 video. The compressed-domain method does not require decoding to the pixel domain; instead, a syntax parser extracts syntax elements that are available only in the compressed domain. Our method extracts only the macroblock type and the corresponding macroblock address. Markov models for fire and non-fire events are evaluated using offline-trained data. Our experiments show that the algorithm successfully detects and identifies fire events in the compressed domain, even though only a small fraction of the data is used in the process.
Citations: 7
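The pipeline the abstract describes — two offline-trained Markov models scored against the stream of macroblock types — can be sketched as a toy likelihood-ratio classifier. The macroblock-type labels, training sequences, and threshold below are hypothetical illustrations, not the paper's actual data:

```python
import math

# Hypothetical macroblock types emitted by the syntax parser.
TYPES = ["I", "P_SKIP", "P_INTER", "P_INTRA"]

def train_markov(sequences):
    """Estimate first-order transition probabilities with add-one smoothing."""
    counts = {a: {b: 1 for b in TYPES} for a in TYPES}
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    return {a: {b: counts[a][b] / sum(counts[a].values()) for b in TYPES}
            for a in TYPES}

def log_likelihood(seq, model):
    return sum(math.log(model[a][b]) for a, b in zip(seq, seq[1:]))

def is_fire(seq, fire_model, nofire_model, threshold=0.0):
    """Likelihood-ratio test between the fire and non-fire Markov models."""
    return log_likelihood(seq, fire_model) - log_likelihood(seq, nofire_model) > threshold

# Toy training data: fire regions flicker, so intra/inter macroblock types
# alternate; a static background is dominated by skipped macroblocks.
fire_seqs = [["P_INTRA", "P_INTER", "P_INTRA", "P_INTER"] * 5]
nofire_seqs = [["P_SKIP"] * 20]
fire_m, nofire_m = train_markov(fire_seqs), train_markov(nofire_seqs)
print(is_fire(["P_INTRA", "P_INTER"] * 6, fire_m, nofire_m))  # flickering block
```

The appeal of the compressed-domain approach is visible even in the toy: classification needs only the macroblock-type sequence, never the decoded pixels.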
Passive Detection and Discrimination of Body Movements in the sub-THz Band: A Case Study
S. Kianoush, S. Savazzi, V. Rampa
DOI: 10.1109/ICASSP.2019.8682165 | Pages: 1597-1601 | Published: 2019-05-12
Abstract: Passive radio sensing is a well-established research topic in which radio-frequency (RF) devices are used as real-time virtual probes that can detect the presence and movement(s) of one or more (non-instrumented) subjects. However, radio sensing methods usually employ frequencies in the unlicensed 2.4-5.0 GHz bands, where multipath effects strongly limit their accuracy, thus reducing their wide acceptance. On the contrary, sub-terahertz (sub-THz) radiation, due to its very short wavelength and reduced multipath effects, is well suited for high-resolution body occupancy detection and vision applications. In this paper, for the first time, we adopt radio devices emitting in the 100 GHz band to process an image of the environment for body motion discrimination inside a workspace area. Movement detection is based on the real-time analysis of body-induced signatures that are estimated from sub-THz measurements and then processed by specific neural-network-based classifiers. Experimental trials validate the proposed methods and compare their performance in an industrial safety monitoring application.
Citations: 4
Learning the Spiral Sharing Network with Minimum Salient Region Regression for Saliency Detection
Zukai Chen, Xin Tan, Hengliang Zhu, Shouhong Ding, Lizhuang Ma, Haichuan Song
DOI: 10.1109/ICASSP.2019.8682531 | Pages: 1667-1671 | Published: 2019-05-12
Abstract: With the development of convolutional neural networks (CNNs), saliency detection methods have made great progress in recent years. However, previous methods sometimes mistakenly highlight non-salient regions, especially against complex backgrounds. To solve this problem, a two-stage method for saliency detection is proposed in this paper. In the first stage, a network is used to regress the minimum salient region (RMSR) containing all salient objects. In the second stage, in order to fuse multi-level features, the spiral sharing network (SSN) is proposed for pixel-level detection on the result of RMSR. Experimental results on four public datasets show that our model outperforms state-of-the-art approaches.
Citations: 0
Autoencoding HRTFs for DNN-Based HRTF Personalization Using Anthropometric Features
Tzu-Yu Chen, Tzu-Hsuan Kuo, T. Chi
DOI: 10.1109/ICASSP.2019.8683814 | Pages: 271-275 | Published: 2019-05-12
Abstract: We propose a deep neural network (DNN) based approach to synthesize the magnitudes of personalized head-related transfer functions (HRTFs) from anthropometric features of the user. To mitigate overfitting when the training dataset is small, we built an autoencoder for dimensionality reduction and for establishing a crucial feature set to represent the raw HRTFs. We then combined the decoder part of the autoencoder with a smaller DNN to synthesize the magnitude HRTFs. In this way, the complexity of the neural networks is greatly reduced, preventing unstable, high-variance results due to overfitting. The proposed approach was compared with a baseline DNN model without an autoencoder, using the log-spectral distortion (LSD) metric for evaluation. Experimental results show that the proposed approach reduces the LSD of the estimated HRTFs with greater stability.
Citations: 21
Multicast Beamforming Using Semidefinite Relaxation and Bounded Perturbation Resilience
Jochen Fink, R. Cavalcante, S. Stańczak
DOI: 10.1109/ICASSP.2019.8682325 | Pages: 4749-4753 | Published: 2019-05-12
Abstract: Semidefinite relaxation followed by randomization is a well-known approach for approximating a solution to the NP-hard max-min fair multicast beamforming problem. While providing a good approximation to the optimal solution, this approach commonly involves computationally demanding interior-point methods. In this study, we propose a solution based on superiorization of bounded-perturbation-resilient iterative operators that scales to systems with a large number of antennas. We show that this method outperforms randomization techniques in many cases, while using only computationally simple operations.
Citations: 2
1-D Convolutional Neural Networks for Signal Processing Applications
S. Kiranyaz, T. Ince, Osama Abdeljaber, Onur Avcı, M. Gabbouj
DOI: 10.1109/ICASSP.2019.8682194 | Pages: 8360-8364 | Published: 2019-05-12
Abstract: 1D convolutional neural networks (CNNs) have recently become the state-of-the-art technique for crucial signal processing applications such as patient-specific ECG classification, structural health monitoring, anomaly detection in power electronics circuitry, and motor-fault detection. This is an expected outcome, as there are numerous advantages to using an adaptive and compact 1D CNN instead of a conventional (2D) deep counterpart. First, compact 1D CNNs can be efficiently trained with a limited dataset of 1D signals, while 2D deep CNNs, besides requiring a 1D-to-2D data transformation, usually need datasets of massive size, e.g., at "Big Data" scale, to prevent the well-known overfitting problem. 1D CNNs can be applied directly to the raw signal (e.g., current, voltage, vibration, etc.) without requiring any pre- or post-processing such as feature extraction, selection, dimensionality reduction, or denoising. Furthermore, due to the simple and compact configuration of such adaptive 1D CNNs, which perform only linear 1D convolutions (scalar multiplications and additions), real-time and low-cost hardware implementation is feasible. This paper reviews the major signal processing applications of compact 1D CNNs with a brief theoretical background, presents their state-of-the-art performance, and concludes by focusing on some major properties.
Keywords: 1-D CNNs, Biomedical Signal Processing, SHM
Citations: 145
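The compact architecture the abstract describes — raw 1D signal, small 1D convolutions (scalar multiplications and additions), a nonlinearity, pooling, and a linear output — can be sketched as a hand-weighted toy forward pass. The kernels, weights, and "fault detector" framing below are invented for illustration, not taken from the paper:

```python
import math

def conv1d(signal, kernel, bias=0.0):
    """Valid-mode 1D sliding dot product (cross-correlation, as in CNN layers)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k)) + bias
            for i in range(len(signal) - k + 1)]

def relu(xs):
    return [max(0.0, x) for x in xs]

def global_avg_pool(xs):
    return sum(xs) / len(xs)

def forward(signal, kernels, weights, bias):
    """Compact 1D CNN: one conv layer, ReLU, global pooling, linear output."""
    features = [global_avg_pool(relu(conv1d(signal, k))) for k in kernels]
    logit = sum(w * f for w, f in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid score in (0, 1)

# Hypothetical "fault detector": a difference kernel responds to vibration,
# a smoothing kernel responds to overall signal level.
kernels = [[1.0, -1.0], [0.5, 0.5]]
weights, bias = [4.0, -1.0], -1.0
vibrating = [(-1.0) ** i for i in range(64)]  # alternating raw samples
flat = [0.2] * 64
print(forward(vibrating, kernels, weights, bias) > 0.5)  # vibration detected
print(forward(flat, kernels, weights, bias) > 0.5)       # no vibration
```

Note how the network consumes the raw samples directly — no feature extraction or dimensionality reduction step precedes it, which is the property the abstract emphasizes.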
An Empirical Study of Speech Processing in the Brain by Analyzing the Temporal Syllable Structure in Speech-input Induced EEG
Rini A. Sharon, Shrikanth S. Narayanan, M. Sur, H. Murthy
DOI: 10.1109/ICASSP.2019.8683572 | Pages: 4090-4094 | Published: 2019-05-12
Abstract: The clinical applicability of electroencephalography (EEG) is well established; however, the use of EEG for constructing brain-computer interfaces to develop communication platforms is relatively recent. To provide more natural means of communication, there is an increasing focus on bringing together speech and EEG signal processing. Quantifying the way our brain processes speech is one way of approaching the problem of speech recognition using brain waves. This paper analyzes the feasibility of recognizing syllable-level units by studying the temporal structure of speech reflected in EEG signals. The slowly varying component of the delta-band EEG (0.3-3 Hz) is present in all other EEG frequency bands. Analysis shows that removing this delta trend from EEG signals reveals a syllable-like structure. Using a 25-syllable framework, classification of EEG data obtained from 13 subjects yields promising results, underscoring the potential of revealing speech-related temporal structure in EEG.
Citations: 11
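The detrending idea — estimate the slowly varying delta-band component and subtract it to expose faster, syllable-rate structure — can be sketched with a moving-average trend remover on synthetic data. The moving average is a stand-in chosen for simplicity; the paper does not specify this particular filter:

```python
import math

def moving_average(signal, window):
    """Slow (delta-band-like) trend estimated with a centered moving average."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out

def remove_delta_trend(signal, window=50):
    """Subtract the slow trend to expose faster structure riding on it."""
    trend = moving_average(signal, window)
    return [x - t for x, t in zip(signal, trend)]

# Synthetic "EEG": a slow 0.3 Hz drift (inside the 0.3-3 Hz delta band)
# plus a faster 8 Hz component, sampled at 100 Hz for 4 seconds.
fs = 100
t = [i / fs for i in range(4 * fs)]
slow = [2.0 * math.sin(2 * math.pi * 0.3 * x) for x in t]
fast = [0.5 * math.sin(2 * math.pi * 8.0 * x) for x in t]
eeg = [a + b for a, b in zip(slow, fast)]

detrended = remove_delta_trend(eeg)
# After detrending, the residual tracks the fast component far more
# closely than the raw signal does.
err_raw = sum((e - f) ** 2 for e, f in zip(eeg, fast))
err_det = sum((d - f) ** 2 for d, f in zip(detrended, fast))
print(err_det < err_raw)
```

In the paper's setting the "fast component" is not a clean sinusoid but the syllable-rate modulation the classifiers are trained on; the detrending step plays the same role as here.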
CRF-based Single-stage Acoustic Modeling with CTC Topology
Hongyu Xiang, Zhijian Ou
DOI: 10.1109/ICASSP.2019.8682256 | Pages: 5676-5680 | Published: 2019-05-12
Abstract: In this paper, we develop conditional random field (CRF) based single-stage (SS) acoustic modeling with a connectionist temporal classification (CTC) inspired state topology, called CTC-CRF for short. CTC-CRF is conceptually simple: it basically implements a CRF layer on top of features generated by the bottom neural network with the special state topology. Like SS-LF-MMI (lattice-free maximum mutual information), CTC-CRFs can be trained from scratch (flat-start), eliminating GMM-HMM pre-training and tree building. Evaluation experiments are conducted on the WSJ, Switchboard, and Librispeech datasets. In a head-to-head comparison, the CTC-CRF model using simple bidirectional LSTMs consistently outperforms the strong SS-LF-MMI baseline across all three benchmark datasets, for both mono-phones and mono-chars. Additionally, CTC-CRFs avoid some ad-hoc operations in SS-LF-MMI.
Citations: 23
Enhanced Virtual Singers Generation by Incorporating Singing Dynamics to Personalized Text-to-speech-to-singing
Kantapon Kaewtip, F. Villavicencio, Fang-Yu Kuo, Mark Harvilla, I. Ouyang, P. Lanchantin
DOI: 10.1109/ICASSP.2019.8682968 | Pages: 6960-6964 | Published: 2019-05-12
Abstract: We present a strategy to enhance the quality of text-to-speech (TTS) based singing voice generation. Speech-to-singing refers to techniques that transform a spoken voice into singing, mainly by manipulating the duration and pitch of a spoken version of a song's lyrics. While this strategy efficiently preserves the speaker identity, the generated singing is not always perceived as fully natural, since vocal conditions generally change between spoken and singing voice. By incorporating speaker-independent natural singing information into TTS-based speech-to-singing (STS), we positively impact the sound quality (e.g., reducing hoarseness), as shown in the subjective evaluation reported at the end of this paper.
Citations: 5
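The duration-manipulation half of speech-to-singing can be illustrated with a naive linear-interpolation time stretch. This is a toy stand-in, not the authors' method — production STS systems use pitch-synchronous or vocoder-based modification that stretches duration without shifting pitch:

```python
def time_stretch(samples, factor):
    """Naive duration manipulation: resample the waveform by linear
    interpolation; factor > 1 lengthens, factor < 1 shortens.
    (Unlike a vocoder, this also shifts pitch by 1/factor.)"""
    n_out = int(len(samples) * factor)
    out = []
    for i in range(n_out):
        pos = i / factor            # fractional index into the input
        j = int(pos)
        frac = pos - j
        a = samples[min(j, len(samples) - 1)]
        b = samples[min(j + 1, len(samples) - 1)]
        out.append(a * (1 - frac) + b * frac)
    return out

# Stretch a spoken syllable's samples to twice the duration, as holding
# a sung note would require.
spoken = [0.0, 1.0, 0.0, -1.0]
sung = time_stretch(spoken, 2.0)
print(len(spoken), len(sung))  # 4 8
```

The pitch-shifting side effect noted in the docstring is exactly why real STS pipelines separate duration and pitch control — the paper's contribution layers natural singing dynamics on top of that manipulation.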
Tensor Super-resolution for Seismic Data
Songjie Liao, Xiao-Yang Liu, Feng Qian, Miao Yin, Guangmin Hu
DOI: 10.1109/ICASSP.2019.8683419 | Pages: 8598-8602 | Published: 2019-05-12
Abstract: In this paper, we propose a novel method for generating high-granularity three-dimensional (3D) seismic data from low-granularity data based on tensor sparse coding, which jointly trains a high-granularity dictionary and a low-granularity dictionary. First, considering the high-dimensional properties of seismic data, we introduce tensor sparse coding to seismic data interpolation. Second, we propose that the dictionary pairs trained on low-granularity and high-granularity seismic data share the same sparse representation, which is used to recover high-granularity data with the high-granularity dictionary. Finally, experiments on seismic data from an actual field show that the proposed method effectively performs seismic trace interpolation and improves the resolution of seismic data imaging.
Citations: 3