2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP): Latest Publications

Classical quadrature rules via Gaussian processes
T. Karvonen, S. Särkkä
{"title":"Classical quadrature rules via Gaussian processes","authors":"T. Karvonen, S. Särkkä","doi":"10.1109/MLSP.2017.8168195","DOIUrl":"https://doi.org/10.1109/MLSP.2017.8168195","url":null,"abstract":"In an extension to some previous work on the topic, we show how all classical polynomial-based quadrature rules can be interpreted as Bayesian quadrature rules if the covariance kernel is selected suitably. As the resulting Bayesian quadrature rules have zero posterior integral variance, the results of this article are mostly of theoretical interest in clarifying the relationship between the two different approaches to numerical integration.","PeriodicalId":6542,"journal":{"name":"2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP)","volume":"162 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75777821","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 24
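As a concrete illustration of the Bayesian quadrature connection described above, the following minimal numpy sketch computes Bayesian quadrature weights and the posterior mean and variance of an integral over [0, 1] under a Gaussian process prior. The squared-exponential kernel, equispaced nodes and numerically approximated kernel means are illustrative assumptions, not the specific kernels constructed in the paper.

```python
import numpy as np

def k(x, y, ell=0.3):
    """Squared-exponential covariance kernel."""
    return np.exp(-0.5 * (x[:, None] - y[None, :]) ** 2 / ell ** 2)

def bq_estimate(f, nodes, ell=0.3, n_grid=2000):
    """Posterior mean and variance of the integral of f over [0, 1]."""
    grid = np.linspace(0.0, 1.0, n_grid)            # grid used to approximate the kernel means
    K = k(nodes, nodes, ell) + 1e-10 * np.eye(len(nodes))
    z = k(nodes, grid, ell).mean(axis=1)            # z_i ≈ ∫ k(x_i, x) dx
    weights = np.linalg.solve(K, z)                 # BQ weights w = K^{-1} z
    mean = weights @ f(nodes)                       # posterior mean of the integral
    kk = k(grid, grid, ell).mean()                  # ≈ ∫∫ k(x, y) dx dy
    var = kk - z @ weights                          # posterior integral variance
    return mean, var

nodes = np.linspace(0.0, 1.0, 7)                    # e.g. equispaced nodes
mean, var = bq_estimate(np.sin, nodes)
print(mean, var)                                    # compare with 1 - cos(1) ≈ 0.4597
```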
Does speech enhancement work with end-to-end ASR objectives?: Experimental analysis of multichannel end-to-end ASR
Tsubasa Ochiai, Shinji Watanabe, S. Katagiri
{"title":"Does speech enhancement work with end-to-end ASR objectives?: Experimental analysis of multichannel end-to-end ASR","authors":"Tsubasa Ochiai, Shinji Watanabe, S. Katagiri","doi":"10.1109/MLSP.2017.8168188","DOIUrl":"https://doi.org/10.1109/MLSP.2017.8168188","url":null,"abstract":"Recently we proposed a novel multichannel end-to-end speech recognition architecture that integrates the components of multichannel speech enhancement and speech recognition into a single neural-network-based architecture and demonstrated its fundamental utility for automatic speech recognition (ASR). However, the behavior of the proposed integrated system remains insufficiently clarified. An open question is whether the speech enhancement component really gains speech enhancement (noise suppression) ability, because it is optimized based on end-to-end ASR objectives instead of speech enhancement objectives. In this paper, we solve this question by conducting systematic evaluation experiments using the CHiME-4 corpus. We first show that the integrated end-to-end architecture successfully obtains adequate speech enhancement ability that is superior to that of a conventional alternative (a delay-and-sum beamformer) by observing two signal-level measures: the signal-todistortion ratio and the perceptual evaluation of speech quality. Our findings suggest that to further increase the performances of an integrated system, we must boost the power of the latter-stage speech recognition component. However, an insufficient amount of multichannel noisy speech data is available. Based on these situations, we next investigate the effect of using a large amount of single-channel clean speech data, e.g., the WSJ corpus, for additional training of the speech recognition component. We also show that our approach with clean speech significantly improves the total performance of multichannel end-to-end architecture in the multichannel noisy ASR tasks.","PeriodicalId":6542,"journal":{"name":"2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP)","volume":"40 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76271262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 11
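The signal-to-distortion ratio mentioned in the abstract can be illustrated with a short numpy sketch; this is the basic scale-projected variant, not the full BSS-eval SDR (which also models filtering distortion), and PESQ requires a dedicated implementation that is not shown.

```python
import numpy as np

def sdr(reference, estimate, eps=1e-12):
    """Signal-to-distortion ratio in dB, projecting the estimate onto the reference
    to remove a global gain (a basic variant of the measure)."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    alpha = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = alpha * reference                       # part of the estimate explained by the reference
    distortion = estimate - target
    return 10.0 * np.log10((np.sum(target ** 2) + eps) / (np.sum(distortion ** 2) + eps))

# Example: a noisy copy of a sine wave
t = np.linspace(0, 1, 16000)
clean = np.sin(2 * np.pi * 440 * t)
noisy = clean + 0.1 * np.random.randn(t.size)
print(sdr(clean, noisy))
```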
Inferring room semantics using acoustic monitoring
Muhammad A Shah, B. Raj, Khaled A. Harras
{"title":"Inferring room semantics using acoustic monitoring","authors":"Muhammad A Shah, B. Raj, Khaled A. Harras","doi":"10.1109/MLSP.2017.8168153","DOIUrl":"https://doi.org/10.1109/MLSP.2017.8168153","url":null,"abstract":"Having knowledge of the environmental context of the user i.e. the knowledge of the users' indoor location and the semantics of their environment, can facilitate the development of many of location-aware applications. In this paper, we propose an acoustic monitoring technique that infers semantic knowledge about an indoor space over time, using audio recordings from it. Our technique uses the impulse response of these spaces as well as the ambient sounds produced in them in order to determine a semantic label for them. As we process more recordings, we update our confidence in the assigned label. We evaluate our technique on a dataset of single-speaker human speech recordings obtained in different types of rooms at three university buildings. In our evaluation, the confidence for the true label generally outstripped the confidence for all other labels and in some cases converged to 100% with less than 30 samples.","PeriodicalId":6542,"journal":{"name":"2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP)","volume":"18 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73853891","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 8
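The confidence-accumulation step described in the abstract amounts to a sequential Bayesian update over candidate room labels. A minimal sketch follows; the label set and the per-recording likelihoods (which in practice would come from an acoustic classifier) are hypothetical placeholders.

```python
import numpy as np

labels = ["office", "lecture_hall", "corridor", "bathroom"]

def update_confidence(prior, likelihood):
    """One Bayesian update: posterior ∝ prior × p(recording | label)."""
    posterior = prior * likelihood
    return posterior / posterior.sum()

# Hypothetical per-recording likelihoods p(recording | label) from an acoustic classifier.
recording_likelihoods = [
    np.array([0.50, 0.25, 0.15, 0.10]),
    np.array([0.55, 0.20, 0.15, 0.10]),
    np.array([0.45, 0.30, 0.15, 0.10]),
]

confidence = np.full(len(labels), 1.0 / len(labels))   # uniform prior over labels
for lik in recording_likelihoods:
    confidence = update_confidence(confidence, lik)
print(dict(zip(labels, np.round(confidence, 3))))
```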
Hankel subspace method for efficient gesture representation
B. Gatto, Anna Bogdanova, L. S. Souza, E. M. Santos
{"title":"Hankel subspace method for efficient gesture representation","authors":"B. Gatto, Anna Bogdanova, L. S. Souza, E. M. Santos","doi":"10.1109/MLSP.2017.8168114","DOIUrl":"https://doi.org/10.1109/MLSP.2017.8168114","url":null,"abstract":"Gesture recognition technology provides multiple opportunities for direct human-computer interaction, without the use of additional external devices. As such, it had been an appealing research area in the field of computer vision. Many of its challenges are related to the complexity of human gestures, which may produce nonlinear distributions under different viewpoints. In this paper, we introduce a novel framework for gesture recognition, which achieves high discrimination of spatial and temporal information while significantly decreasing the computational cost. The proposed method consists of four stages. First, we generate an ordered subset of images from a gesture video, filtering out those that do not contribute to the recognition task. Second, we express spatial and temporal gesture information in a compact trajectory matrix. Then, we represent the obtained matrix as a subspace, achieving discriminative information, as the trajectory matrices derived from different gestures generate dissimilar clusters in a low dimension space. Finally, we apply soft weights to find the optimal dimension of each gesture subspace. We demonstrate practical and theoretical gains of our compact representation through experimental evaluation using two publicity available gesture datasets.","PeriodicalId":6542,"journal":{"name":"2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP)","volume":"37 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75278315","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 4
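A minimal numpy sketch of the general idea, a trajectory (Hankel-style) matrix built from an ordered frame sequence, a subspace obtained via SVD, and subspace comparison via canonical correlations, is given below. The frame features, window length and similarity measure are illustrative assumptions rather than the paper's exact construction and soft-weighting scheme.

```python
import numpy as np

def hankel_trajectory(frames, window=3):
    """Stack `window` consecutive frame vectors into each column of a trajectory matrix."""
    frames = np.asarray(frames)                       # shape (T, d): T frames, d features each
    T, d = frames.shape
    cols = [frames[t:t + window].reshape(-1) for t in range(T - window + 1)]
    return np.stack(cols, axis=1)                     # shape (window * d, T - window + 1)

def subspace_basis(trajectory, rank=5):
    """Orthonormal basis of the dominant left singular subspace of the trajectory matrix."""
    U, _, _ = np.linalg.svd(trajectory, full_matrices=False)
    return U[:, :rank]

def subspace_similarity(U1, U2):
    """Mean squared canonical correlation between two subspaces (used for matching)."""
    s = np.linalg.svd(U1.T @ U2, compute_uv=False)
    return np.mean(s ** 2)

# Example with random "gesture" frame features
gesture_a = np.random.randn(30, 16)
gesture_b = np.random.randn(30, 16)
Ua = subspace_basis(hankel_trajectory(gesture_a))
Ub = subspace_basis(hankel_trajectory(gesture_b))
print(subspace_similarity(Ua, Ub))
```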
Unsupervised multiview learning with partial distribution information
Shashini De Silva, Jinsub Kim, R. Raich
{"title":"Unsupervised multiview learning with partial distribution information","authors":"Shashini De Silva, Jinsub Kim, R. Raich","doi":"10.1109/MLSP.2017.8168138","DOIUrl":"https://doi.org/10.1109/MLSP.2017.8168138","url":null,"abstract":"We consider a training data collection mechanism wherein, instead of annotating each training instance with a class label, additional features drawn from a known class-conditional distribution are acquired concurrently. Considering true labels as latent variables, a maximum likelihood approach is proposed to train a classifier based on these unlabeled training data. Furthermore, the case of correlated training instances is considered, wherein latent label variables for subsequently collected training instances form a first-order Markov chain. A convex optimization approach and expectation-maximization algorithms are presented to train classifiers. The efficacy of the proposed approach is validated using the experiments with the iris data and the MNIST handwritten digit data.","PeriodicalId":6542,"journal":{"name":"2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP)","volume":"1 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74547237","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
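The E-step of such an EM procedure combines the current classifier's predictions with the known class-conditional distribution of the auxiliary features. A minimal sketch under an assumed Gaussian form for that distribution is shown below; it is not the paper's full convex-optimization or Markov-chain treatment.

```python
import numpy as np
from scipy.stats import multivariate_normal

def e_step(aux_features, clf_probs, class_means, class_covs):
    """Posterior responsibilities p(y | x, z) ∝ p(y | x) p(z | y), with p(z | y)
    a known Gaussian per class (an assumed form, for illustration)."""
    n, k = clf_probs.shape
    resp = np.zeros((n, k))
    for c in range(k):
        resp[:, c] = clf_probs[:, c] * multivariate_normal.pdf(
            aux_features, mean=class_means[c], cov=class_covs[c])
    return resp / resp.sum(axis=1, keepdims=True)

# Two classes, 2-D auxiliary features with known class-conditional Gaussians
class_means = [np.array([0.0, 0.0]), np.array([2.0, 2.0])]
class_covs = [np.eye(2), np.eye(2)]
aux = np.random.randn(5, 2) + class_means[1]          # auxiliary features drawn near class 1
clf = np.full((5, 2), 0.5)                            # uninformative classifier to start
print(e_step(aux, clf, class_means, class_covs))      # responsibilities lean toward class 1
```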
Blind source separation for nonstationary tensor-valued time series
Joni Virta, K. Nordhausen
{"title":"Blind source separation for nonstationary tensor-valued time series","authors":"Joni Virta, K. Nordhausen","doi":"10.1109/MLSP.2017.8168122","DOIUrl":"https://doi.org/10.1109/MLSP.2017.8168122","url":null,"abstract":"Two standard assumptions of the classical blind source separation (BSS) theory are frequently violated by modern data sets. First, the majority of the existing methodology assumes vector-valued signals while data exhibiting a natural tensor structure is frequently observed. Second, many typical BSS applications exhibit serial dependence which is usually modeled using second order stationarity assumptions, which is however often quite unrealistic. To address these two issues we extend three existing methods of nonstationary blind source separation to tensor-valued time series. The resulting methods naturally factor in the tensor form of the observations without resorting to vectorization of the signals. Additionally, the methods allow for two types of nonstationarity, either the source series are blockwise second order weak stationary or their variances change smoothly in time. A simulation study and an application to video data show that the proposed extensions outperform their vectorial counterparts and successfully identify source series of interest.","PeriodicalId":6542,"journal":{"name":"2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP)","volume":"62 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84017047","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 4
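A minimal vector-valued (non-tensor) sketch of blockwise nonstationary BSS is shown below: whiten with the full-data covariance, then eigendecompose the covariance of one block in the whitened coordinates. This two-block special case only illustrates the second-order nonstationarity idea; the paper's methods operate on tensor-valued series without vectorization.

```python
import numpy as np

def nonstationary_bss_two_blocks(X):
    """Unmix X (channels × samples) using covariances of two halves of the data.
    Whitening with the full covariance plus an eigendecomposition of one whitened
    block covariance jointly diagonalizes both blocks (vector-valued sketch only)."""
    X = X - X.mean(axis=1, keepdims=True)
    C_full = np.cov(X)
    d, E = np.linalg.eigh(C_full)
    W_white = E @ np.diag(1.0 / np.sqrt(d)) @ E.T      # whitening matrix
    Y = W_white @ X
    half = X.shape[1] // 2
    C_block = np.cov(Y[:, :half])                      # covariance of the first block
    _, V = np.linalg.eigh(C_block)
    return V.T @ W_white                               # unmixing matrix

# Example: two sources whose variances change between the two halves
n = 4000
s1 = np.r_[np.random.randn(n // 2) * 0.3, np.random.randn(n // 2) * 2.0]
s2 = np.r_[np.random.randn(n // 2) * 2.0, np.random.randn(n // 2) * 0.3]
A = np.array([[1.0, 0.6], [0.4, 1.0]])                # mixing matrix
X = A @ np.vstack([s1, s2])
W = nonstationary_bss_two_blocks(X)
print(W @ A)                                           # ≈ scaled permutation matrix
```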
Neural network alternatives to convolutive audio models for source separation
Shrikant Venkataramani, Cem Subakan, P. Smaragdis
{"title":"Neural network alternatives toconvolutive audio models for source separation","authors":"Shrikant Venkataramani, Cem Subakan, P. Smaragdis","doi":"10.1109/MLSP.2017.8168108","DOIUrl":"https://doi.org/10.1109/MLSP.2017.8168108","url":null,"abstract":"Convolutive Non-Negative Matrix Factorization model factorizes a given audio spectrogram using frequency templates with a temporal dimension. In this paper, we present a convolutional auto-encoder model that acts as a neural network alternative to convolutive NMF. Using the modeling flexibility granted by neural networks, we also explore the idea of using a Recurrent Neural Network in the encoder. Experimental results on speech mixtures from TIMIT dataset indicate that the convolutive architecture provides a significant improvement in separation performance in terms of BSS eval metrics.","PeriodicalId":6542,"journal":{"name":"2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP)","volume":"65 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85488090","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 14
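A minimal PyTorch sketch of a convolutional auto-encoder in the spirit described above follows: a 1-D convolution over time encodes the spectrogram into component activations, and a transposed convolution decodes it with temporally extended frequency templates. The layer sizes are assumptions, and the recurrent-encoder variant is omitted.

```python
import torch
import torch.nn as nn

class ConvAutoencoder(nn.Module):
    """Encoder: 1-D convolution over time turns the spectrogram into activations of
    a few components; decoder: transposed convolution rebuilds the spectrogram from
    temporally extended frequency templates (the convolutive-NMF analogy)."""

    def __init__(self, freq_bins=257, n_components=20, template_frames=8):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(freq_bins, n_components, kernel_size=template_frames),
            nn.ReLU(),                                 # non-negative activations
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(n_components, freq_bins, kernel_size=template_frames),
            nn.ReLU(),                                 # non-negative reconstruction
        )

    def forward(self, spectrogram):                    # (batch, freq_bins, frames)
        return self.decoder(self.encoder(spectrogram))

model = ConvAutoencoder()
spec = torch.rand(4, 257, 100)                         # batch of magnitude spectrograms
recon = model(spec)
loss = nn.functional.mse_loss(recon, spec)             # train by reconstruction
print(recon.shape, loss.item())
```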
Deep convolutional neural networks for interpretable analysis of EEG sleep stage scoring
A. Vilamala, Kristoffer Hougaard Madsen, L. K. Hansen
{"title":"Deep convolutional neural networks for interpretable analysis of EEG sleep stage scoring","authors":"A. Vilamala, Kristoffer Hougaard Madsen, L. K. Hansen","doi":"10.1109/MLSP.2017.8168133","DOIUrl":"https://doi.org/10.1109/MLSP.2017.8168133","url":null,"abstract":"Sleep studies are important for diagnosing sleep disorders such as insomnia, narcolepsy or sleep apnea. They rely on manual scoring of sleep stages from raw polisomnography signals, which is a tedious visual task requiring the workload of highly trained professionals. Consequently, research efforts to purse for an automatic stage scoring based on machine learning techniques have been carried out over the last years. In this work, we resort to multitaper spectral analysis to create visually interpretable images of sleep patterns from EEG signals as inputs to a deep convolutional network trained to solve visual recognition tasks. As a working example of transfer learning, a system able to accurately classify sleep stages in new unseen patients is presented. Evaluations in a widely-used publicly available dataset favourably compare to state-of-the-art results, while providing a framework for visual interpretation of outcomes.","PeriodicalId":6542,"journal":{"name":"2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP)","volume":"1 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74506339","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 127
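The multitaper spectral images used as CNN inputs can be sketched with numpy and scipy as below; the window length, time-bandwidth product and taper count are illustrative choices, not the paper's settings.

```python
import numpy as np
from scipy.signal import windows

def multitaper_spectrogram(x, fs, win_sec=2.0, step_sec=1.0, nw=4.0, n_tapers=7):
    """Average the periodograms from several DPSS tapers in each sliding window."""
    win_len = int(win_sec * fs)
    step = int(step_sec * fs)
    tapers = windows.dpss(win_len, nw, Kmax=n_tapers)   # shape (n_tapers, win_len)
    frames = []
    for start in range(0, len(x) - win_len + 1, step):
        segment = x[start:start + win_len]
        spectra = np.abs(np.fft.rfft(tapers * segment, axis=1)) ** 2
        frames.append(spectra.mean(axis=0))             # multitaper estimate for this frame
    spec = np.array(frames).T                            # shape (freq_bins, n_frames)
    freqs = np.fft.rfftfreq(win_len, d=1.0 / fs)
    return freqs, 10.0 * np.log10(spec + 1e-12)          # dB image, e.g. as CNN input

fs = 100                                                 # typical EEG sampling rate
t = np.arange(0, 30, 1.0 / fs)                           # one 30-second epoch
eeg = np.sin(2 * np.pi * 10 * t) + 0.5 * np.random.randn(t.size)
freqs, image = multitaper_spectrogram(eeg, fs)
print(image.shape)
```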
Deep divergence-based clustering
Michael C. Kampffmeyer, Sigurd Løkse, F. Bianchi, L. Livi, A. Salberg, R. Jenssen
{"title":"Deep divergence-based clustering","authors":"Michael C. Kampffmeyer, Sigurd Løkse, F. Bianchi, L. Livi, A. Salberg, R. Jenssen","doi":"10.1109/MLSP.2017.8168158","DOIUrl":"https://doi.org/10.1109/MLSP.2017.8168158","url":null,"abstract":"A promising direction in deep learning research is to learn representations and simultaneously discover cluster structure in unlabeled data by optimizing a discriminative loss function. Contrary to supervised deep learning, this line of research is in its infancy and the design and optimization of a suitable loss function with the aim of training deep neural networks for clustering is still an open challenge. In this paper, we propose to leverage the discriminative power of information theoretic divergence measures, which have experienced success in traditional clustering, to develop a new deep clustering network. Our proposed loss function incorporates explicitly the geometry of the output space, and facilitates fully unsupervised training end-to-end. Experiments on real datasets show that the proposed algorithm achieves competitive performance with respect to other state-of-the-art methods.","PeriodicalId":6542,"journal":{"name":"2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP)","volume":"7 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85449374","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 7
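One plausible form of a divergence-based clustering term, a Cauchy-Schwarz-style similarity between soft cluster assignments computed through a Gaussian kernel over hidden representations, is sketched below in PyTorch. It is an illustrative, differentiable loss in the spirit of the abstract, not the authors' exact objective.

```python
import torch

def cs_divergence_loss(assignments, hidden, sigma=1.0, eps=1e-9):
    """Sum of pairwise Cauchy-Schwarz-style similarities between clusters: small when
    clusters occupy well-separated regions of the hidden space (illustrative form)."""
    dists = torch.cdist(hidden, hidden) ** 2
    K = torch.exp(-dists / (2 * sigma ** 2))            # Gaussian kernel over hidden points
    loss = 0.0
    k = assignments.shape[1]
    for i in range(k):
        for j in range(i + 1, k):
            a_i, a_j = assignments[:, i], assignments[:, j]
            num = a_i @ K @ a_j
            den = torch.sqrt((a_i @ K @ a_i) * (a_j @ K @ a_j) + eps)
            loss = loss + num / den
    return loss

# Soft assignments from a (hypothetical) clustering head, hidden features from the encoder
hidden = torch.randn(64, 10)
logits = torch.randn(64, 3, requires_grad=True)
assignments = torch.softmax(logits, dim=1)
loss = cs_divergence_loss(assignments, hidden)
loss.backward()                                          # differentiable, so it trains end-to-end
print(loss.item())
```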
Visualizing and improving scattering networks
Fergal Cotter, N. Kingsbury
{"title":"Visualizing and improving scattering networks","authors":"Fergal Cotter, N. Kingsbury","doi":"10.1109/MLSP.2017.8168136","DOIUrl":"https://doi.org/10.1109/MLSP.2017.8168136","url":null,"abstract":"Scattering Transforms (or ScatterNets) introduced by Mallat in [1] are a promising start into creating a well-defined feature extractor to use for pattern recognition and image classification tasks. They are of particular interest due to their architectural similarity to Convolutional Neural Networks (CNNs), while requiring no parameter learning and still performing very well (particularly in constrained classification tasks). In this paper we visualize what the deeper layers of a ScatterNet are sensitive to using a ‘DeScatterNet’. We show that the higher orders of ScatterNets are sensitive to complex, edge-like patterns (checker-boards and rippled edges). These complex patterns may be useful for texture classification, but are quite dissimilar from the patterns visualized in second and third layers of Convolutional Neural Networks (CNNs) — the current state of the art Image Classifiers. We propose that this may be the source of the current gaps in performance between ScatterNets and CNNs (83% vs 93% on CIFAR-10 for ScatterNet+SVM vs ResNet). We then use these visualization tools to propose possible enhancements to the ScatterNet design, which show they have the power to extract features more closely resembling CNNs, while still being well-defined and having the invariance properties fundamental to ScatterNets.","PeriodicalId":6542,"journal":{"name":"2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP)","volume":"59 4 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79763828","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 14
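A scattering transform of the kind visualized in the paper can be computed with the third-party kymatio package, as in the minimal sketch below (assuming kymatio is installed); this is not the authors' DeScatterNet code.

```python
import numpy as np
from kymatio.numpy import Scattering2D

# Two-scale, second-order scattering transform of a 32x32 image (CIFAR-10-sized).
scattering = Scattering2D(J=2, shape=(32, 32), max_order=2)

image = np.random.rand(32, 32).astype(np.float32)
coeffs = scattering(image)                  # scattering coefficients, downsampled by 2**J
print(coeffs.shape)                         # (n_channels, 8, 8): order-0, -1 and -2 paths

# These fixed, wavelet-based features can be fed to a classifier such as an SVM,
# which is the ScatterNet+SVM pipeline the abstract compares against CNNs.
```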