2010 IEEE International Conference on Acoustics, Speech and Signal Processing: Latest Papers

Directed network inference using a measure of directed information
2010 IEEE International Conference on Acoustics, Speech and Signal Processing Pub Date: 2010-03-14 DOI: 10.1109/ICASSP.2010.5495654
Y. Liu, Selin Aviyente
Abstract: The concept of mutual information (MI) has been widely used for inferring complex networks such as genetic regulatory networks. However, the MI based methods cannot infer directed or dynamic networks. In this paper, we propose a new network inference algorithm to infer directed acyclic networks which can determine both the connectivity and causality between different nodes based on the concept of directed information (DI) and conditional directed information. The proposed method is applied to both simulated data and Electroencephalography (EEG) data to evaluate its effectiveness.
Citations: 3
Distributed correlated Q-learning for dynamic transmission control of sensor networks
2010 IEEE International Conference on Acoustics, Speech and Signal Processing Pub Date: 2010-03-14 DOI: 10.1109/ICASSP.2010.5495265
Jane Wei Huang, Quanyan Zhu, V. Krishnamurthy, T. Başar
Abstract: This paper considers a Markovian dynamical game theoretic setting for distributed transmission control in a wireless sensor network. The available spectrum bandwidth is modeled as a Markov chain. A distributed algorithm named correlated Q-learning algorithm is proposed to obtain the correlated equilibrium policies of the system. This algorithm has the decentralized feature and is easily implementable in a real system. A numerical example is also provided to verify the performance of the proposed algorithm.
Citations: 13
Multilingual acoustic modeling for speech recognition based on subspace Gaussian Mixture Models
2010 IEEE International Conference on Acoustics, Speech and Signal Processing Pub Date: 2010-03-14 DOI: 10.1109/ICASSP.2010.5495646
L. Burget, Petr Schwarz, Mohit Agarwal, Pinar Akyazi, Kai Feng, Arnab Ghoshal, O. Glembek, N. Goel, M. Karafiát, Daniel Povey, A. Rastrow, R. Rose, Samuel Thomas
Abstract: Although research has previously been done on multilingual speech recognition, it has been found to be very difficult to improve over separately trained systems. The usual approach has been to use some kind of “universal phone set” that covers multiple languages. We report experiments on a different approach to multilingual speech recognition, in which the phone sets are entirely distinct but the model has parameters not tied to specific states that are shared across languages. We use a model called a “Subspace Gaussian Mixture Model” where states' distributions are Gaussian Mixture Models with a common structure, constrained to lie in a subspace of the total parameter space. The parameters that define this subspace can be shared across languages. We obtain substantial WER improvements with this approach, especially with very small amounts of in-language training data.
Citations: 189
A track before detect approach for sequential Bayesian tracking of multiple speech sources
2010 IEEE International Conference on Acoustics, Speech and Signal Processing Pub Date: 2010-03-14 DOI: 10.1109/ICASSP.2010.5495092
Pasi Pertilä, M. Hämäläinen
Abstract: This paper describes a novel multiple acoustic source tracking method based on the track before detect paradigm. Multiple particle filters are used to represent the state of all sources. Sources are detected and removed using a likelihood ratio obtained from particle weights. The weights are obtained by evaluating the likelihood of microphone pair phase difference. Tracking performance from recorded data with rich sequences of speech is presented using multiple object tracking metrics. Results show that the proposed method can detect and track multiple temporally overlapping speech sources as well as switching talkers even in weak signal-to-noise ratios.
Citations: 19
Fast similarity search on a large speech data set with neighborhood graph indexing
2010 IEEE International Conference on Acoustics, Speech and Signal Processing Pub Date: 2010-03-14 DOI: 10.1109/ICASSP.2010.5494950
K. Aoyama, Shinji Watanabe, H. Sawada, Yasuhiro Minami, N. Ueda, Kazumi Saito
Abstract: This paper presents a novel graph-based approach to the problem of quickly finding a speech model acoustically similar to a query model from a large set of speech models. Each speech model in the set is represented by a Gaussian mixture model (GMM), and the dissimilarity from one GMM to another is measured with a Kullback-Leibler divergence (KLD). Conventional pruning techniques based on the triangle inequality for fast similarity search are not available because the model space with a KLD is not a metric space. We propose a search method that is characterized by an index of a degree-reduced nearest neighbor (DRNN) graph. The search method can efficiently find the most similar (closest) GMM to a query, exploring the DRNN graph in a best-first manner. Experimental evaluations on utterance GMM search tasks reveal a significantly low computational cost of the proposed method.
Citations: 5
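
The remark above that a model space with a KLD is not a metric space can be checked on a toy case. The sketch below is not the authors' method: it assumes single univariate Gaussians instead of GMMs (the KLD between GMMs has no closed form), and the function name `kl_gauss` and the three example models are hypothetical. It shows that the divergence is asymmetric and can violate the triangle inequality, which is why triangle-inequality pruning does not apply.

```python
import math

def kl_gauss(mu_p, sigma_p, mu_q, sigma_q):
    """Closed-form KL(p || q) in nats for univariate Gaussians p and q."""
    return (math.log(sigma_q / sigma_p)
            + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2.0 * sigma_q ** 2)
            - 0.5)

# Three illustrative models given as (mean, standard deviation).
a, b, c = (0.0, 1.0), (0.0, 2.0), (0.0, 4.0)

kl_ab = kl_gauss(*a, *b)
kl_ba = kl_gauss(*b, *a)
kl_bc = kl_gauss(*b, *c)
kl_ac = kl_gauss(*a, *c)

# Asymmetry: KL(a || b) differs from KL(b || a).
print(f"KL(a||b) = {kl_ab:.4f}, KL(b||a) = {kl_ba:.4f}")
# Triangle inequality fails: the direct divergence exceeds the two-step sum.
print(f"KL(a||c) = {kl_ac:.4f}, KL(a||b) + KL(b||c) = {kl_ab + kl_bc:.4f}")
```

Because neither symmetry nor the triangle inequality holds, a search index for this setting has to rely on something other than metric pruning, such as the graph exploration the abstract describes.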
A discriminative model for continuous speech recognition based on Weighted Finite State Transducers
2010 IEEE International Conference on Acoustics, Speech and Signal Processing Pub Date: 2010-03-14 DOI: 10.1109/ICASSP.2010.5495096
Shinji Watanabe, Takaaki Hori, E. McDermott, Atsushi Nakamura
Abstract: This paper proposes a discriminative model for speech recognition that directly optimizes the parameters of a speech model represented in the form of a decoding graph. In the process of recognition, a decoder, given an input speech signal, searches for an appropriate label sequence among possible combinations from separate knowledge sources of speech, e.g., acoustic, lexicon, and language models. It is more reasonable to use an integrated knowledge source, which is composed of these models and forms an overall space to be searched by a decoder, than to use separate ones. This paper aims to estimate a speech model composed in this way directly in the search network, unlike discriminative training approaches, which estimate parameters in acoustic or language model layers. Our approach is formulated as the weight parameter optimization of log-linear distributions in the decoding arcs of a Weighted Finite State Transducer (WFST) to efficiently handle a large network statically. The weight parameters are estimated by an averaged perceptron algorithm. The experimental results show that, especially when the model size is small, the proposed approach provided better recognition performance than the conventional maximum likelihood and comparable to or slightly better performance than discriminative training approaches.
Citations: 8
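
The abstract above names an averaged perceptron as the estimation algorithm. As a rough illustration only, the sketch below shows the generic averaged perceptron on a toy binary classification task; it is an assumption-laden simplification, not the paper's structured version over WFST decoding arcs. The function names and the toy data are hypothetical.

```python
def train_averaged_perceptron(samples, epochs=10):
    """Train on (features, label) pairs with label in {-1, +1}.

    Returns the weight vector averaged over every training step, which
    tends to be more stable than the final weight vector alone.
    """
    dim = len(samples[0][0])
    w = [0.0] * dim       # current weights
    w_sum = [0.0] * dim   # running sum of weights over all steps
    steps = 0
    for _ in range(epochs):
        for x, y in samples:
            score = sum(wi * xi for wi, xi in zip(w, x))
            if y * score <= 0:  # misclassified (or on the boundary): update
                w = [wi + y * xi for wi, xi in zip(w, x)]
            w_sum = [si + wi for si, wi in zip(w_sum, w)]
            steps += 1
    return [si / steps for si in w_sum]

def predict(w, x):
    """Return +1 or -1 according to the sign of the linear score."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else -1

# Toy, linearly separable data; the last feature is a constant bias term.
toy = [
    ([2.0, 1.0, 1.0], 1),
    ([1.5, 2.0, 1.0], 1),
    ([-1.0, -1.5, 1.0], -1),
    ([-2.0, -0.5, 1.0], -1),
]
w_avg = train_averaged_perceptron(toy)
print(w_avg, [predict(w_avg, x) for x, _ in toy])
```

In a structured setting the per-sample update would instead compare the features of the reference path with those of the best decoded path, but the averaging step works the same way.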
Leveraging speaker diarization for meeting recognition from distant microphones
2010 IEEE International Conference on Acoustics, Speech and Signal Processing Pub Date: 2010-03-14 DOI: 10.1109/ICASSP.2010.5495626
A. Stolcke, G. Friedland, David Imseng
Abstract: We investigate using state-of-the-art speaker diarization output for speech recognition purposes. While it seems obvious that speech recognition could benefit from the output of speaker diarization (“Who spoke when”) for effective feature normalization and model adaptation, such benefits have remained elusive in the very challenging domain of meeting recognition from distant microphones. In this study, we show that recognition gains are possible by careful post-processing of the diarization output. Still, recognition accuracy may suffer when the underlying diarization system performs worse than expected, even compared to far less sophisticated speaker-clustering techniques. We obtain a more accurate and robust overall system by combining recognition output with multiple speaker segmentations and clusterings. We evaluate our methods on data from the 2009 NIST Rich Transcription meeting recognition evaluation.
Citations: 26
Optimized intrinsic dimension estimator using nearest neighbor graphs
2010 IEEE International Conference on Acoustics, Speech and Signal Processing Pub Date: 2010-03-14 DOI: 10.1109/ICASSP.2010.5494931
K. Sricharan, R. Raich, A. Hero
Abstract: We develop an approach to intrinsic dimension estimation based on k-nearest neighbor (kNN) distances. The dimension estimator is derived using a general theory on functionals of kNN density estimates. This enables us to predict the performance of the dimension estimation algorithm. In addition, it allows for optimization of free parameters in the algorithm. We validate our theory through simulations and compare our estimator to previous kNN based dimensionality estimation approaches.
Citations: 16
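
To make the idea of estimating intrinsic dimension from kNN distances concrete, here is a minimal sketch of one earlier kNN-based approach, the maximum-likelihood estimator associated with Levina and Bickel. It is not the optimized estimator of this paper. The function name, the choice `k=10`, the sample size, and the synthetic data (a 2-D plane embedded in 3-D) are all assumptions made for illustration.

```python
import math
import random

def knn_dimension_mle(points, k=10):
    """Average the per-point kNN maximum-likelihood dimension estimates.

    For each point, with T_j the distance to its j-th nearest neighbor,
    the estimate is (k - 2) / sum_{j=1}^{k-1} log(T_k / T_j).
    """
    estimates = []
    for i, p in enumerate(points):
        # Brute-force sorted distances to all other points (fine for small n).
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        t_k = dists[k - 1]
        log_ratio_sum = sum(math.log(t_k / dists[j]) for j in range(k - 1))
        estimates.append((k - 2) / log_ratio_sum)
    return sum(estimates) / len(estimates)

# Synthetic data: a 2-D plane embedded in 3-D via (u, v) -> (u, v, u + v).
rng = random.Random(0)
plane = []
for _ in range(300):
    u, v = rng.random(), rng.random()
    plane.append((u, v, u + v))

estimate = knn_dimension_mle(plane, k=10)
print(f"estimated intrinsic dimension: {estimate:.2f}")
```

The ambient dimension is 3, but the estimate should land near 2 because neighbor distances grow at the rate dictated by the 2-D surface. The neighborhood size `k` is exactly the kind of free parameter whose tuning the abstract refers to.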
Automatic state discovery for unstructured audio scene classification
2010 IEEE International Conference on Acoustics, Speech and Signal Processing Pub Date: 2010-03-14 DOI: 10.1109/ICASSP.2010.5495605
Julian Ramos, S. Siddiqi, A. Dubrawski, Geoffrey J. Gordon, Abhishek Sharma
Abstract: In this paper we present a novel scheme for unstructured audio scene classification that possesses three highly desirable and powerful features: autonomy, scalability, and robustness. Our scheme is based on our recently introduced machine learning algorithm called Simultaneous Temporal And Contextual Splitting (STACS) that discovers the appropriate number of states and efficiently learns accurate Hidden Markov Model (HMM) parameters for the given data. STACS-based algorithms train HMMs up to five times faster than Baum-Welch, avoid the overfitting problem commonly encountered in learning large state-space HMMs using Expectation Maximization (EM) methods such as Baum-Welch, and achieve superior classification results on a very diverse dataset with minimal pre-processing. Furthermore, our scheme has proven to be highly effective for building real-world applications and has been integrated into a commercial surveillance system as an event detection component.
Citations: 5
2-D two-fold symmetric circular shaped filter design with homomorphic processing application
2010 IEEE International Conference on Acoustics, Speech and Signal Processing Pub Date: 2010-03-14 DOI: 10.1109/ICASSP.2010.5495877
A. J. Seneviratne, H. H. Kha, H. Tuan, Truong Q. Nguyen
Abstract: A design method for a linear-phase, two-dimensional (2-D), two-fold symmetric circular shaped filter is presented in this paper. Although the proposed method designs a non-separable filter, its implementation has linear complexity. The shape of the passband and the stopband is expressed in terms of level sets of second order trigonometric polynomials. This enables the transformation of the filter specifications to a Semi-Definite Program (SDP) of moderate dimension. The proposed filter outperforms currently available filter design methods. We present a performance comparison, as well as a homomorphic processing image enhancement example to illustrate the effectiveness of this method.
Citations: 0