2019 IEEE Global Conference on Signal and Information Processing (GlobalSIP): Latest Papers

A Comparison of Boosted Deep Neural Networks for Voice Activity Detection
2019 IEEE Global Conference on Signal and Information Processing (GlobalSIP) Pub Date : 2019-11-01 DOI: 10.1109/GlobalSIP45357.2019.8969258
Harshit Krishnakumar, D. Williamson
Abstract: Voice activity detection (VAD) is an integral part of speech processing for real world problems, and a lot of work has been done to improve VAD performance. Of late, deep neural networks have been used to detect the presence of speech and this has offered tremendous gains. Unfortunately, these efforts have been either restricted to feed-forward neural networks that do not adequately capture frequency and temporal correlations, or the recurrent architectures have not been adequately tested in noisy environments. In this paper, we investigate different neural network configurations for voice activity detection. More specifically, we explore solutions that incorporate multi-resolution stacking and ensemble learning using convolutional, long short-term memory (LSTM), and dilated convolutional neural network architectures. We evaluate our approach using various speech signals that are captured in different amounts of noise. Our results show that a multi-resolution ensemble approach using LSTM recurrent neural networks performs best. This is demonstrated for seen and unseen testing scenarios.
Citations: 2
Joint Angle And Delay Estimation (Jade) By Partial Relaxation
2019 IEEE Global Conference on Signal and Information Processing (GlobalSIP) Pub Date : 2019-11-01 DOI: 10.1109/GlobalSIP45357.2019.8969218
Ahmad Bazzi
Abstract: This paper addresses the Joint Angle and Delay estimation problem of a received multi-carrier frame through multiple antennas. In particular, each source is parameterized through its complex gain, Time-of-arrival (ToA) and Angle-of-arrival (AoA). We develop three novel JADE methods in a partially relaxed model, where one relaxes the interference zone, while focusing on a certain direction at a given time. Computer simulations are given with comparison to the Cramér-Rao Bound.
Citations: 3
Estimating Public Speaking Anxiety from Speech Signals Using Unsupervised Transfer Learning
2019 IEEE Global Conference on Signal and Information Processing (GlobalSIP) Pub Date : 2019-11-01 DOI: 10.1109/GlobalSIP45357.2019.8969502
Kexin Feng, Megha Yadav, Md. Nazmus Sakib, A. Behzadan, Theodora Chaspari
Abstract: Public speaking anxiety (PSA) ranks as a top social phobia across the world caused by various confounding factors. Motivated by the inherent data sparsity and lack of annotations in human-related applications, we propose unsupervised learning techniques to estimate PSA from speech signals. The labeled source data come from the publicly available CREMA-D dataset, while the unlabeled target data come from real-life public speaking tasks. Since fear is one of the major factors of PSA, the goal of this study is to build fear-specific representations from the labeled source data to estimate the degree of fear in the target data, and examine the extent to which the latter is associated with anxiety during the public speaking encounter. Transfer learning is performed through the domain-adversarial neural network (DANN) and Wasserstein generative adversarial network (WGAN). Results indicate that the proposed unsupervised fear-specific estimates can detect public speaking anxiety with Pearson’s correlation coefficient of 0.28 (p < 0.01). When these fear-specific estimates are combined with the degree of an individual’s preparation for the public speaking task, obtained through self-reports, they yield Pearson’s correlation of 0.55 (p < 0.01). These results indicate the feasibility of leveraging labeled emotion-specific corpora for detecting human-related outcomes in real life and provide a foundation for smart assistive technologies through the automated real-time estimation of anxiety during public speaking.
Citations: 4
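The public speaking anxiety entry above reports its results as Pearson's correlation coefficients. As a reminder of what that statistic measures, here is a minimal sketch in plain Python. The function name `pearson_r` and the toy numbers are assumptions made for illustration; they are not data or code from the paper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy values, NOT taken from the paper.
fear_estimates = [0.1, 0.4, 0.35, 0.8, 0.6]
anxiety_scores = [2.0, 3.0, 2.5, 4.5, 3.0]
r = pearson_r(fear_estimates, anxiety_scores)
```

A value near 1 or -1 indicates a strong linear relationship, and a value near 0 indicates a weak one. The significance levels quoted in the abstract need a separate hypothesis test that this sketch does not perform.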
Wide Separate 3D Convolution for Video Super Resolution
2019 IEEE Global Conference on Signal and Information Processing (GlobalSIP) Pub Date : 2019-11-01 DOI: 10.1109/GlobalSIP45357.2019.8969481
Xiafei Yu, Jiying Zhao
Abstract: Video super-resolution (VSR) aims to recover a realistic high-resolution (HR) frame from its corresponding center low-resolution (LR) frame and some neighbouring supporting frames. To utilize the extra temporal information of the supporting LR frames, most VSR methods rely heavily on accurate motion estimation and compensation models to align LR frames. However, the motions between frames have no ground truth, and it is difficult to train motion estimation and compensation models. Inaccurate results lead to artifacts and blur, which in turn damage the recovery of high-resolution frames. We propose an effective separate 3D Convolution Neural Network (CNN) with wide activation to overcome the drawback of utilizing motion estimation and compensation models. Separate 3D convolution factorizes the 3D convolution into a 2D convolution along the spatial domain and a 1D convolution along the temporal domain, which can not only capture temporal and spatial information simultaneously but also reduce the computational complexity compared to a 3D CNN.
Citations: 1
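The complexity claim in the entry above can be made concrete by counting weights. The sketch below compares a full 3D convolution with a factorized pair (a 2D spatial convolution followed by a 1D temporal convolution). The channel widths and kernel sizes are hypothetical, since the abstract does not give the network's layer sizes, and biases are ignored.

```python
def conv3d_weights(c_in, c_out, k_t, k_s):
    """Weight count of a full 3D convolution with a k_t x k_s x k_s kernel (no bias)."""
    return c_in * c_out * k_t * k_s * k_s


def separable_conv3d_weights(c_in, c_mid, c_out, k_t, k_s):
    """Weight count of a 1 x k_s x k_s spatial conv followed by a k_t x 1 x 1 temporal conv."""
    spatial = c_in * c_mid * k_s * k_s   # 2D convolution along the spatial domain
    temporal = c_mid * c_out * k_t       # 1D convolution along the temporal domain
    return spatial + temporal


# Hypothetical layer: 64 input, intermediate, and output channels, 3x3x3 kernel.
full = conv3d_weights(64, 64, k_t=3, k_s=3)                   # 110592
separable = separable_conv3d_weights(64, 64, 64, k_t=3, k_s=3)  # 49152
```

For this assumed layer the factorized form needs fewer than half the weights of the full 3D kernel. The exact ratio depends on the intermediate channel width, which is a design choice.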
A Novel Quantization Method for Deep Learning-Based Massive MIMO CSI Feedback
2019 IEEE Global Conference on Signal and Information Processing (GlobalSIP) Pub Date : 2019-11-01 DOI: 10.1109/GlobalSIP45357.2019.8969557
Tong Chen, Jiajia Guo, Shi Jin, Chao-Kai Wen, Geoffrey Y. Li
Abstract: In massive multiple-input multiple-output (MIMO) systems, channel state information (CSI) needs to be fed back to the base station (BS) by user equipment (UE) to attain the potential benefits of massive MIMO. But the large number of antennas at the BS causes a huge feedback overhead, thereby making it prohibitive to realize CSI feedback in massive MIMO. Deep learning-based (DL) compressive sensing methods for CSI feedback can potentially reduce the overhead significantly. However, without quantization, a data-bearing bitstream for transmission cannot be produced at the UE. In this paper, we propose a novel quantization framework and training strategy for DL-based CSI feedback, which not only makes the current CSI feedback network applicable in real communication systems but also minimizes the introduced quantization distortion to improve the reconstruction quality. Experimental results demonstrate that the proposed quantization method performs well and is robust to quantization errors.
Citations: 19
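To illustrate why the entry above says a bitstream cannot be produced without quantization, the sketch below applies a plain uniform scalar quantizer to a real-valued codeword. This is a textbook quantizer chosen for illustration, not the framework proposed in the paper. The function names, the [0, 1) value range, and the example codeword are all assumptions.

```python
import math


def quantize(values, bits, lo=0.0, hi=1.0):
    """Map each value in [lo, hi) to an integer index in [0, 2**bits - 1]."""
    levels = 2 ** bits
    step = (hi - lo) / levels
    indices = []
    for v in values:
        idx = math.floor((v - lo) / step)
        indices.append(max(0, min(levels - 1, idx)))  # clamp out-of-range values
    return indices


def dequantize(indices, bits, lo=0.0, hi=1.0):
    """Reconstruct each value as the midpoint of its quantization cell."""
    step = (hi - lo) / 2 ** bits
    return [lo + (i + 0.5) * step for i in indices]


def to_bitstream(indices, bits):
    """Concatenate fixed-width binary representations of the indices."""
    return "".join(format(i, f"0{bits}b") for i in indices)


# Hypothetical compressed codeword with entries in [0, 1).
codeword = [0.0, 0.26, 0.5, 0.99]
indices = quantize(codeword, bits=2)          # [0, 1, 2, 3]
bitstream = to_bitstream(indices, bits=2)     # "00011011"
reconstructed = dequantize(indices, bits=2)   # [0.125, 0.375, 0.625, 0.875]
```

For in-range inputs the reconstruction error of this quantizer is at most half a step, which is the distortion that a trained decoder would have to tolerate.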
ITD Modeling Based on Anthropometrics and KEMAR Coefficients Using Deep Neural Networks
2019 IEEE Global Conference on Signal and Information Processing (GlobalSIP) Pub Date : 2019-11-01 DOI: 10.1109/GlobalSIP45357.2019.8969348
Saif S. Alotaibi, M. Wickert
Abstract: ITD and ILD, versus source arrival direction, serve as essential binaural cues for spatial hearing. Individualized ITD and ILD can be used to render better 3D audio than non-individualized ones. Due to the correlation between ITD and some anthropometric features, machine learning methods such as principal component analysis (PCA) and deep neural networks (DNNs) have become important tools for deploying individualized ITDs. The available measured ITDs do not match the exact sound source directions. An ITD correction method will be presented to overcome the irregularities that occur due to subject head movements during database creation measurements. KEMAR’s ITD coefficients are utilized to correct the misplacement of a subject’s ITD. DNNs are used to obtain a new subject’s ITD for 1250 different azimuth and elevation angles. Mean absolute error (MAE) is used to compare the proposed ITD model with the available analytical models.
Citations: 0
Partial Discharge Classification in Power Electronics Applications using Machine Learning
2019 IEEE Global Conference on Signal and Information Processing (GlobalSIP) Pub Date : 2019-11-01 DOI: 10.1109/GlobalSIP45357.2019.8969553
Ebrahim Balouji, T. Hammarström, T. McKelvey
Abstract: A study of machine learning (ML) methods for classification of data from partial discharges (PDs) is described. A novel set of features is suggested and tested using an extensive set of machine learning based algorithms. The aim is to classify PDs occurring within insulation systems and power electronics devices (PED). Due to the increased use of pulse width modulation waveform (PWM) in PEDs, an increased insulation degradation has been observed due to a more intense PD exposure. This study aims to develop suitable tools to detect types of defects to facilitate diagnostics as well as to improve isolation system design. To evaluate the performance of ML based classification, several algorithms have been developed to detect and classify PDs from different kinds of material defects with the aim to address the reason behind the appearance of partial discharges. Experiments with different PD source locations, defect volumes and voltage rise times were carried out on an artificial cavity test object. Signal features found to be important include, for example, the maximum magnitude, duration, the distance from polarity shift, the time distance between PDs and the absolute value of the area of the detected PD waveform. It has been observed that forming such PD features based on their time occurrence results in an accurate and generalized solution. With these features the best results were achieved with the deep learning LSTM architecture reaching a test accuracy of 98.3%. For industry applications, feature engineering is useful to reduce the amount of data that must be analyzed by the neural network or ML algorithm.
Citations: 9
Machine Learning-Based Roadside Vehicular Traffic Localization via Opportunistic Wireless Sensing
2019 IEEE Global Conference on Signal and Information Processing (GlobalSIP) Pub Date : 2019-11-01 DOI: 10.1109/GlobalSIP45357.2019.8969382
Kyle W. Mcclintick, M. Page, T. Wickramarathne, A. Wyglinski
Abstract: Comprehensive Situational Awareness (SA) in mixed traffic environments (i.e., both autonomous and human-operated platforms) is a critical requirement in addressing some of the challenges that hinder the deployment of autonomous vehicle (AV) systems onto roadways. In this paper, a novel framework that leverages machine learning techniques for utilizing Signals of Opportunity (SoO) for robust localization of all vehicles operating along a stretch of roadway is presented. By making use of ubiquitous wireless emissions from vehicles, the presented approach performs vehicle localization without any active participation/assistance from vehicles thus making it a suitable candidate for SA in mixed traffic environments. Our simulation results show that given the road shape and number of vehicles present, observed 2D localization estimates generated by an arbitrary algorithm whose error is described by a Gaussian bivariate distribution with 10 meters covariance yields unbiased vehicle centroid estimates with less than one meter mean squared error by eight Kalman Filter (KF) iterations. A set of KFs for each vehicle is used to leverage the filtering of multiple estimates per vehicle per filter step to reduce measurement noise by averaging, while a clustering algorithm performs the dual role of forming KF set priors and classifying location estimates to their correct vehicle.
Citations: 0
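The noise reduction by averaging described in the entry above can be illustrated with a toy scalar Kalman filter on a static state. This is a deliberately simplified sketch, not the paper's filter bank: it assumes one dimension, no process noise, and independent measurements, and every number in it is hypothetical.

```python
def kalman_variance_trace(prior_var, meas_var, per_step, steps):
    """Posterior variance of a static scalar state after each Kalman update.

    At every step, `per_step` independent measurements with variance
    `meas_var` are averaged first, so the update sees variance
    meas_var / per_step.
    """
    effective_meas_var = meas_var / per_step
    var = prior_var
    trace = []
    for _ in range(steps):
        gain = var / (var + effective_meas_var)  # Kalman gain for a scalar state
        var = (1.0 - gain) * var                 # posterior variance after the update
        trace.append(var)
    return trace


# Hypothetical settings: vague prior, noisy measurements, 2 estimates per step.
trace = kalman_variance_trace(prior_var=100.0, meas_var=10.0, per_step=2, steps=8)
```

In this toy model the inverse variance grows by `per_step / meas_var` at each update, so more estimates per step and more iterations both shrink the uncertainty. The figures quoted in the abstract come from the authors' simulations and are not reproduced by this sketch.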
Evaluating goal-advice appropriateness for personal financial advice
2019 IEEE Global Conference on Signal and Information Processing (GlobalSIP) Pub Date : 2019-11-01 DOI: 10.1109/GlobalSIP45357.2019.8969340
S. A. Chen, Adam J. Makarucha, Nebula Alam, W. Sherchan, Simon Harris, G. Yiapanis, Christopher J. Butler
Abstract: Over the years, the number of consumers seeking personal financial advisory services has grown globally. However, recent studies indicate a worrying decline in consumers’ trust and confidence in advisers and financial institutions, as well as low regulatory compliance rates. Inspiring consumer trust through increased vigilance of advice is not possible using current auditing practices as reviews are manual, time-consuming and complex. In this paper, we describe a generalised framework which leverages machine learning approaches to systematically characterise the risk status of financial advice documents prior to client delivery. We show how the framework presented provides a comprehensive, accurate and efficient compliance review of financial advice documents for financial advisers and compliance officers alike.
Citations: 2
Speech Recognition Driven Assistive Framework for Remote Patient Monitoring
2019 IEEE Global Conference on Signal and Information Processing (GlobalSIP) Pub Date : 2019-11-01 DOI: 10.1109/GlobalSIP45357.2019.8969464
Marc Jayson Baucas, P. Spachos
Abstract: Health care resources have started to become scarce due to the increase in demand. Hospitals have begun to run out of space, forcing them to deny admission of patients. Remote Patient Monitoring (RPM) has the potential to help citizens who suffer from chronic diseases and to provide environments where easy-to-access healthcare is available. RPM allows people to receive the same amount of care without the difficulty of finding a spot in a hospital ward. However, some roadblocks end up preventing RPM from being implemented by more healthcare providers. Data integrity, user privacy, and high power consumption are some of these concerns. With data transmission and transaction, privacy and confidentiality have always been an issue. High power consumption is a concern due to RPM’s demand for continuous data collection. This paper proposes a framework that reinforces the RPM system to address these concerns. The design not only allows better data filtering for privacy but also a more responsive system with the use of controlled surveillance and speech recognition. Overall, this framework provides an opportunity for RPMs to be a viable implementation for healthcare providers.
Citations: 3