2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC): latest publications, page 8

PGT: Proposal-guided object tracking

2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2017-12-01 DOI: 10.1109/APSIPA.2017.8282318

Han-Ul Kim, Chang-Su Kim

Citations: 0

A drag-and-drop type human computer interaction technique based on electrooculogram

2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2017-12-01 DOI: 10.1109/APSIPA.2017.8282126

S. Ogai, Toshihisa Tanaka

Citations: 1

Sound source localization using binaural difference for hose-shaped rescue robot

2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2017-12-01 DOI: 10.1109/APSIPA.2017.8282292

Narumi Mae, Yoshiki Mitsui, S. Makino, Daichi Kitamura, Nobutaka Ono, Takeshi Yamada, H. Saruwatari

Abstract: Rescue robots have been developed for search and rescue operations in times of large-scale disasters. Such a robot is used to search for survivors in disaster sites by capturing their voices with its microphone array. However, since the robot has many vibration motors, ego noise is mixed with voices, and it is difficult to differentiate the ego noise from a call for help from a disaster survivor. In our previous works, an ego noise reduction technique that combines a method of blind source separation called independent low-rank matrix analysis and postprocessing for noise cancellation was proposed. In the practical use of this robot, to determine the precise location of survivors, the direction of the observed voice should be estimated after the ego noise reduction process. To achieve this objective, in this study, a new hose-shaped rescue robot with microphone arrays was developed. Moreover, because this robot can record stereo sound, we adapt a postfilter called MOSIE to our previous noise reduction method so that the operator can listen to stereo sound. Through an experiment in a simulated disaster site, we confirm that the operator can perceive the direction of a survivor's location by applying a speech enhancement technique combining independent low-rank matrix analysis, noise cancellation, and postfiltering to the observed multichannel noisy signals.

Citations: 1

Real-time digitized neural-spike storage scheme in multiple channels for biomedical applications

2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2017-12-01 DOI: 10.1109/APSIPA.2017.8282256

Anand Kumar Mukhopadhyay, I. Chakrabarti, M. Sharad

Citations: 1

An investigation to transplant emotional expressions in DNN-based TTS synthesis

2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2017-12-01 DOI: 10.1109/APSIPA.2017.8282231

Katsuki Inoue, Sunao Hara, M. Abe, Nobukatsu Hojo, Yusuke Ijima

Abstract: In this paper, we investigate deep neural network (DNN) architectures to transplant emotional expressions to improve the expressiveness of DNN-based text-to-speech (TTS) synthesis. DNN is expected to have potential power in mapping between linguistic information and acoustic features. From multi-speaker and/or multi-language perspectives, several types of DNN architecture have been proposed and have shown good performances. We tried to expand the idea to transplant emotion, constructing shared emotion-dependent mappings. The following three types of DNN architecture are examined: (1) the parallel model (PM) with an output layer consisting of both speaker-dependent layers and emotion-dependent layers, (2) the serial model (SM) with an output layer consisting of emotion-dependent layers preceded by speaker-dependent hidden layers, (3) the auxiliary input model (AIM) with an input layer consisting of emotion and speaker IDs as well as linguistic feature vectors. The DNNs were trained using neutral speech uttered by 24 speakers, and sad speech and joyful speech uttered by 3 speakers from those 24 speakers. In terms of unseen emotional synthesis, subjective evaluation tests showed that the PM performs much better than the SM and slightly better than the AIM. In addition, this test showed that the SM is the best of the three models when training data includes emotional speech uttered by the target speaker.
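The auxiliary input model (AIM) in this abstract feeds emotion and speaker IDs into the network alongside the linguistic features. As a rough illustration only, the sketch below builds such an input vector by concatenation. The speaker count (24) and the three emotions come from the abstract; the one-hot encoding, the ordering of the parts, the toy feature values, and the names `one_hot` and `build_aim_input` are assumptions made for this example, not details confirmed by the paper.

```python
# Sizes taken from the abstract: 24 speakers; neutral, sad, joyful speech.
# The one-hot coding and the concatenation order are assumptions.
EMOTIONS = ["neutral", "sad", "joyful"]
NUM_SPEAKERS = 24


def one_hot(index: int, size: int) -> list[float]:
    """Return a vector of length `size` with 1.0 at `index` and 0.0 elsewhere."""
    if not 0 <= index < size:
        raise ValueError(f"index {index} out of range for size {size}")
    return [1.0 if i == index else 0.0 for i in range(size)]


def build_aim_input(linguistic: list[float], speaker_id: int, emotion: str) -> list[float]:
    """Concatenate linguistic features with speaker and emotion codes (hypothetical helper)."""
    speaker_code = one_hot(speaker_id, NUM_SPEAKERS)
    emotion_code = one_hot(EMOTIONS.index(emotion), len(EMOTIONS))
    return list(linguistic) + speaker_code + emotion_code


frame = [0.2, 0.0, 1.0, 0.5]  # toy linguistic feature vector (made up)
x = build_aim_input(frame, speaker_id=5, emotion="sad")
print(len(x))  # 4 linguistic + 24 speaker + 3 emotion entries
```

Under this encoding, requesting an emotion a speaker never recorded only means pairing that speaker's code with a different emotion code at synthesis time, which is one way to read the "unseen emotional synthesis" setting in the abstract.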

Citations: 26

Blind speaker counting in highly reverberant environments by clustering coherence features

2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2017-12-01 DOI: 10.1109/APSIPA.2017.8282303

Shahab Pasha, Jacob Donley, C. Ritz

Citations: 8

Joint estimation of signal and mutual coupling parameters based on spatially spread polarization sensitive array

2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2017-12-01 DOI: 10.1109/APSIPA.2017.8282156

Huiyong Li, Zihui Luo, Julan Xie, Jun Li

Citations: 2

Development of a multi-modal personal authentication interface

2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2017-12-01 DOI: 10.1109/APSIPA.2017.8282125

Sung-Phil Kim, Jae-Hwan Kang, Y. Jo, Ian Oakley

Citations: 1

Online sound structure analysis based on generative model of acoustic feature sequences

2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2017-12-01 DOI: 10.1109/APSIPA.2017.8282236

Keisuke Imoto, Nobutaka Ono, M. Niitsuma, Y. Yamashita

Abstract: We propose a method for the online sound structure analysis based on a Bayesian generative model of acoustic feature sequences, with which the hierarchical generative process of the sound clip, acoustic topic, acoustic word, and acoustic feature is assumed. In this model, it is assumed that sound clips are organized based on the combination of latent acoustic topics, and each acoustic topic is represented by a Gaussian mixture model (GMM) over an acoustic feature space, where the components of the GMM correspond to acoustic words. Since the conventional batch algorithm for learning this model requires a huge amount of calculation, it is difficult to analyze the massive amount of sound data. Moreover, the batch algorithm does not allow us to analyze the sequentially obtained data. Our variational Bayes-based online algorithm for this generative model can analyze the structure of sounds sound clip by sound clip. The experimental results show that the proposed online algorithm can reduce the calculation cost by about 90% and estimate the posterior distributions as efficiently as the conventional batch algorithm.
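The hierarchy named in this abstract (sound clip, acoustic topic, acoustic word, acoustic feature) can be made concrete with a small sampling sketch. It covers only the generative direction, not the online variational Bayes inference the paper proposes. Several things here are assumptions for illustration: features are one-dimensional, the Gaussian components (acoustic words) are shared across topics with topic-specific mixture weights, the clip's topic proportions are fixed by hand rather than drawn from a prior, and the name `sample_clip` and all numeric values are made up.

```python
import random


def sample_clip(rng, topic_probs, word_probs, word_means, word_stds, num_frames):
    """Sample one toy sound clip as a list of (topic, word, feature) triples."""
    frames = []
    topics = list(range(len(topic_probs)))
    for _ in range(num_frames):
        # 1. Pick a latent acoustic topic from the clip's topic proportions.
        z = rng.choices(topics, weights=topic_probs)[0]
        # 2. Pick an acoustic word, i.e. a GMM component, using that topic's weights.
        words = list(range(len(word_probs[z])))
        w = rng.choices(words, weights=word_probs[z])[0]
        # 3. Draw the acoustic feature from the chosen Gaussian component.
        x = rng.gauss(word_means[w], word_stds[w])
        frames.append((z, w, x))
    return frames


rng = random.Random(0)             # seeded for repeatability
topic_probs = [0.7, 0.3]           # toy topic proportions for one clip
word_probs = [[0.9, 0.1, 0.0],     # topic 0 mostly uses word 0
              [0.0, 0.2, 0.8]]     # topic 1 mostly uses word 2
word_means = [-5.0, 0.0, 5.0]      # one Gaussian per acoustic word
word_stds = [1.0, 1.0, 1.0]

clip = sample_clip(rng, topic_probs, word_probs, word_means, word_stds, num_frames=50)
print(clip[:3])
```

Learning would run the other way: given only the feature values, estimate posteriors over the topic and word assignments, which the abstract says the proposed algorithm does one sound clip at a time.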

Citations: 0

Understanding multiple-input multiple-output active noise control from a perspective of sampling and reconstruction

2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2017-12-01 DOI: 10.1109/APSIPA.2017.8282013

Chuang Shi, Huiyong Li, Dongyuan Shi, Bhan Lam, W. Gan

Citations: 13