2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC): Latest Publications

Image super-resolution based on error compensation with convolutional neural network
Wei-Ting Lu, Chien-Wei Lin, Chih-Hung Kuo, Ying-Chan Tung
DOI: 10.1109/APSIPA.2017.8282203
Abstract: Convolutional neural networks (CNNs) have been widely studied for super-resolution (SR) and other image restoration tasks. In this paper, we propose an additional error-compensation convolutional neural network (EC-CNN) that is trained on the concept of iterative back projection (IBP). The residuals between interpolated images and ground-truth images are used to train the network, so the CNN model can compensate the residual projection in IBP more accurately. This CNN-based IBP can be further combined with the super-resolution CNN (SRCNN). Experimental results show that our method, applied as a post-processing step, significantly enhances the quality of upscaled images: on average it outperforms SRCNN by 0.14 dB and SRCNN-EX by 0.08 dB in PSNR at scaling factor 3.
Cited by: 4
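The iterative back projection idea underlying the EC-CNN can be illustrated without any learned components: a high-resolution estimate is repeatedly refined by down-projecting it, comparing against the observed low-resolution image, and back-projecting the residual. A minimal numpy sketch, where box-filter downsampling and nearest-neighbour upsampling are illustrative choices rather than the paper's operators:

```python
import numpy as np

def downsample(img, s):
    # Box-filter downsampling by factor s (illustrative projection operator).
    h, w = img.shape
    return img[:h - h % s, :w - w % s].reshape(h // s, s, w // s, s).mean(axis=(1, 3))

def upsample(img, s):
    # Nearest-neighbour upsampling by factor s (illustrative back-projection).
    return np.repeat(np.repeat(img, s, axis=0), s, axis=1)

def iterative_back_projection(lr, s, n_iter=20):
    """Refine an HR estimate until its down-projection matches the LR input."""
    hr = upsample(lr, s)                   # initial high-resolution guess
    for _ in range(n_iter):
        residual = lr - downsample(hr, s)  # error in the LR domain
        hr = hr + upsample(residual, s)    # back-project the residual
    return hr

rng = np.random.default_rng(0)
truth = rng.random((32, 32))
lr = downsample(truth, 2)
hr = iterative_back_projection(lr, 2)
# After convergence the reconstruction is consistent with the observation:
print(np.abs(downsample(hr, 2) - lr).max())
```

The EC-CNN can be thought of as replacing the fixed back-projection step with a trained residual predictor, which is where the PSNR gains over plain interpolation come from.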
Importance of non-uniform prosody modification for speech recognition in emotion conditions
Vishnu Vidyadhara Raju Vegesna, Hari Krishna Vydana, S. Gangashetty, A. Vuppala
DOI: 10.1109/APSIPA.2017.8282109
Abstract: A mismatch between training and operating environments degrades the performance of automatic speech recognition (ASR) systems. One major source of this mismatch is the presence of expressive (emotive) speech in operational environments. Emotions in speech mainly manifest as changes in the prosody parameters of pitch, duration, and energy. This work aims at improving ASR performance in the presence of emotive speech without modifying the existing ASR system. Prosody modification of pitch, duration, and energy is achieved by tuning the modification factors according to the relative differences between the neutral and emotional data sets. A neutral version of each emotive utterance is generated using uniform and non-uniform prosody modification methods before recognition. The IITKGP-SESC corpus is used for building the ASR system, which is evaluated on the emotions anger, happiness, and compassion. ASR performance improves when the prosody-modified emotive utterance is recognized in place of the original emotive utterance, with an average accuracy improvement of around 5% from the non-uniform prosody modification methods.
Cited by: 8
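The modification-factor idea can be sketched numerically. In this toy example (the function names and the mean-ratio rule are illustrative assumptions, not the paper's exact estimator), factors are derived from the relative difference between emotive and neutral statistics and applied uniformly over an utterance; the paper's non-uniform variant would vary the factors across speech segments:

```python
import numpy as np

def modification_factors(emotive_f0, neutral_f0, emotive_energy, neutral_energy):
    # Relative differences between emotive and neutral statistics give the
    # factors used to map emotive prosody back towards neutral (illustrative).
    pitch_factor = np.mean(neutral_f0) / np.mean(emotive_f0)
    energy_factor = np.mean(neutral_energy) / np.mean(emotive_energy)
    return pitch_factor, energy_factor

def apply_uniform(f0, energy, pitch_factor, energy_factor):
    # Uniform modification: one factor for the whole utterance.  A non-uniform
    # scheme would use different factors per segment (e.g. vowels vs. others).
    return f0 * pitch_factor, energy * energy_factor

# Toy statistics: anger raises mean F0 and energy relative to neutral speech.
angry_f0 = np.array([220.0, 240.0, 260.0])
neutral_f0 = np.array([180.0, 200.0, 190.0])
angry_en = np.array([1.5, 1.8, 1.6])
neutral_en = np.array([1.0, 1.1, 0.9])

pf, ef = modification_factors(angry_f0, neutral_f0, angry_en, neutral_en)
mod_f0, mod_en = apply_uniform(angry_f0, angry_en, pf, ef)
print(round(float(np.mean(mod_f0)), 1))  # mean F0 pulled to the neutral mean
```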
A deep learning architecture for classifying medical images of anatomy object
S. Khan, S. Yong
DOI: 10.1109/APSIPA.2017.8282299
Abstract: Deep learning architectures, particularly convolutional neural networks (CNNs), have shown an intrinsic ability to automatically extract high-level representations from big data. CNNs have produced impressive results in natural image classification, but a major hurdle to their deployment in the medical domain is the relative lack of training data compared to general imaging benchmarks such as ImageNet. In this paper we present a comparative evaluation of three milestone architectures, i.e., LeNet, AlexNet, and GoogLeNet, and propose our own CNN architecture for classifying medical anatomy images. Experiments show that the proposed architecture outperforms the three milestone architectures in classifying medical images of anatomy objects.
Cited by: 41
MSE-optimized CP-based CFO estimation in OFDM systems over multipath channels
Tzu-Chiao Lin, See-May Phoong
DOI: 10.1109/APSIPA.2017.8282146
Abstract: Carrier frequency offset (CFO) is an important issue in the study of orthogonal frequency division multiplexing (OFDM) systems. It is well known that CFO destroys the orthogonality of the subcarriers and significantly degrades the bit error rate (BER) performance of OFDM systems. In this paper, a cyclic-prefix (CP) based algorithm is proposed for blind CFO estimation in OFDM transmission over multipath channels. The proposed method minimizes the theoretical mean square error (MSE), and a closed-form formula is derived. Simulation results show that the proposed method performs very well.
Cited by: 2
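The classical blind CP-based estimator that this line of work builds on can be shown in a few lines: because the cyclic prefix duplicates the symbol tail, correlating the two copies accumulates a phase of exactly -2*pi*eps. The sketch below uses an ideal noiseless channel for clarity; the paper's contribution is an MSE-optimized weighting of this correlation for multipath channels, which is not reproduced here:

```python
import numpy as np

rng = np.random.default_rng(1)
N, Ncp = 64, 16          # FFT size and cyclic-prefix length (example values)
eps = 0.12               # true CFO, normalised to the subcarrier spacing

# One OFDM symbol: random QPSK subcarriers -> time domain -> prepend CP.
X = (rng.choice([1, -1], N) + 1j * rng.choice([1, -1], N)) / np.sqrt(2)
x = np.fft.ifft(X)
tx = np.concatenate([x[-Ncp:], x])

# Apply the carrier frequency offset (ideal channel, no noise, for clarity).
n = np.arange(N + Ncp)
rx = tx * np.exp(2j * np.pi * eps * n / N)

# The CP (first Ncp samples) equals the symbol tail (samples N..N+Ncp-1),
# so their correlation carries a phase of -2*pi*eps.
corr = np.sum(rx[:Ncp] * np.conj(rx[N:N + Ncp]))
eps_hat = -np.angle(corr) / (2 * np.pi)
print(round(float(eps_hat), 6))
```

Note this estimator is only unambiguous for |eps| < 0.5, i.e. within half a subcarrier spacing.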
Electrolaryngeal speech modification towards singing aid system for laryngectomees
Kazuho Morikawa, T. Toda
DOI: 10.1109/APSIPA.2017.8282097
Abstract: Towards the development of a singing aid system for laryngectomees, we propose a method for converting electrolaryngeal (EL) speech produced with an electrolarynx into more natural-sounding singing voices. Singing with an electrolarynx is inflexible because the pitch of EL speech is determined by the source excitation signal mechanically produced by the device, so the melodies of songs to be sung must be embedded in the electrolarynx in advance. In addition, the sound quality of singing voices produced with the electrolarynx is severely degraded by its mechanical excitation sounds, which are emitted as external noise. To address these problems, the proposed conversion method uses (1) pitch control by playing a musical instrument and (2) noise suppression. In the pitch control, the pitch patterns of music sounds played while singing with the electrolarynx are modified to exhibit characteristics typically observed in singing voices, and the modified pitch patterns are used as targets in the conversion from EL speech to singing voices. In the noise suppression, spectral subtraction is used to suppress the leaked excitation sounds. Experimental results demonstrate that (1) the naturalness of singing voices is significantly improved by the noise suppression and (2) the pitch pattern modification is not necessarily effective in the conversion from EL speech into singing voices.
Cited by: 2
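Spectral subtraction, the noise-suppression technique named in the abstract, subtracts an estimated noise magnitude spectrum from each noisy frame while keeping the noisy phase. A minimal single-frame sketch, assuming a known noise reference and bin-aligned toy signals (real systems estimate the noise spectrum from non-speech frames and process overlapping windows):

```python
import numpy as np

def spectral_subtraction(noisy, noise_estimate, floor=0.01):
    """Subtract a noise magnitude spectrum from one noisy frame (sketch)."""
    Y = np.fft.rfft(noisy)
    Nmag = np.abs(np.fft.rfft(noise_estimate))
    # Subtract magnitudes, keep a small spectral floor to avoid negatives,
    # and resynthesise with the noisy phase.
    mag = np.maximum(np.abs(Y) - Nmag, floor * np.abs(Y))
    return np.fft.irfft(mag * np.exp(1j * np.angle(Y)), n=len(noisy))

# Toy frame: a "voice" sinusoid plus a deterministic low-frequency hum,
# both aligned to FFT bins so the example is exact.
t = np.arange(256)
voice = np.sin(2 * np.pi * 14 * t / 256)
hum = 0.5 * np.sin(2 * np.pi * 2 * t / 256)
cleaned = spectral_subtraction(voice + hum, hum)

err_before = np.mean(hum ** 2)                 # residual error with no suppression
err_after = np.mean((cleaned - voice) ** 2)    # residual error after subtraction
print(err_after < err_before)
```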
Sliced voxel representations with LSTM and CNN for 3D shape recognition
R. Miyagi, Masaki Aono
DOI: 10.1109/APSIPA.2017.8282044
Abstract: We propose a sliced voxel representation, which we call Sliced Square Voxels (SSV), based on LSTM (long short-term memory) and CNN (convolutional neural network) networks, for three-dimensional shape recognition. Given an arbitrary 3D model, we first convert it into a binary voxel grid of size 32x32x32. Then, after fixing a view position, we slice the binary voxel data vertically along the depth direction. A CNN exploits the 2D projected shape information of each slice, and its outputs are fed into an LSTM, which is our main idea, as the LSTM is expected to capture the spatial topology across slices. Our experiments show the proposed method to be superior to a 3D-CNN baseline we prepared, and comparisons on large-scale 3D model datasets (ModelNet10 and ModelNet40) show that it also outperforms related previous methods.
Cited by: 4
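The data preparation step described in the abstract, binarising to a 32x32x32 grid and slicing along depth, is straightforward to sketch. Each 2D slice would then pass through the CNN, and the resulting per-slice feature vectors would be consumed in order by the LSTM (the network itself is omitted here):

```python
import numpy as np

def slice_voxels(voxel):
    """Slice a binary voxel grid along the depth axis into a sequence of
    2D images; each slice is a CNN input, and the CNN outputs form the
    ordered sequence fed to the LSTM (networks not shown)."""
    # voxel has shape (depth, height, width); slice along axis 0.
    return [voxel[d] for d in range(voxel.shape[0])]

rng = np.random.default_rng(2)
# Stand-in for a binarised 3D model after view normalisation.
voxel = (rng.random((32, 32, 32)) > 0.5).astype(np.uint8)
slices = slice_voxels(voxel)
print(len(slices), slices[0].shape)  # 32 slices of 32x32 each
```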
Speech emotion recognition using convolutional long short-term memory neural network and support vector machines
Nattapong Kurpukdee, Tomoki Koriyama, Takao Kobayashi, S. Kasuriya, C. Wutiwiwatchai, P. Lamsrichan
DOI: 10.1109/APSIPA.2017.8282315
Abstract: In this paper, we propose a speech emotion recognition technique that uses a convolutional long short-term memory recurrent neural network (ConvLSTM-RNN) as a phoneme-based feature extractor from the raw input speech signal. The ConvLSTM-RNN outputs phoneme-based emotion probabilities for every frame of an input utterance. These probabilities are then converted into statistical features of the utterance and used as inputs to support vector machines (SVMs) or a linear discriminant analysis (LDA) system that classifies utterance-level emotions. To assess the effectiveness of the proposed technique, we conducted classification experiments on four emotions (anger, happiness, sadness, and neutral) on the IEMOCAP database. The results show that the proposed technique, with either the SVM or the LDA classifier, outperforms the conventional ConvLSTM-based approach.
Cited by: 21
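The bridging step, turning variable-length frame-level probabilities into a fixed-length vector for the SVM or LDA classifier, can be sketched as follows. The particular statistics chosen here (mean, standard deviation, max, min per emotion) are an illustrative assumption; the paper does not necessarily use this exact set:

```python
import numpy as np

def utterance_features(frame_probs):
    """Convert frame-level emotion probabilities (n_frames x n_emotions)
    into a fixed-length statistical feature vector for an SVM/LDA classifier.
    The choice of statistics is illustrative."""
    stats = [frame_probs.mean(axis=0),
             frame_probs.std(axis=0),
             frame_probs.max(axis=0),
             frame_probs.min(axis=0)]
    return np.concatenate(stats)

rng = np.random.default_rng(3)
# Stand-in for ConvLSTM-RNN outputs: 120 frames, 4 emotion probabilities
# per frame (each row sums to one).
probs = rng.dirichlet(np.ones(4), size=120)
feats = utterance_features(probs)
print(feats.shape)  # one fixed-length vector regardless of utterance length
```

Because the output dimension is independent of the number of frames, utterances of any duration map to the same feature space, which is what makes a standard SVM applicable.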
Nonuniform sampling theorems for random signals in the offset linear canonical transform domain
Y. Bao, Yan-Na Zhang, Yu-E. Song, Bingzhao Li, P. Dang
DOI: 10.1109/APSIPA.2017.8282008
Abstract: With the rapid development of the offset linear canonical transform (OLCT) in optics and signal processing, it is necessary to consider nonuniform sampling associated with the OLCT. The analysis and applications of nonuniform sampling for deterministic signals in the OLCT domain have been well studied and published, but no results on the reconstruction of random signals from nonuniform samples in the OLCT domain have been proposed until now. In this paper, the nonuniform sampling and reconstruction of random signals in the OLCT domain are investigated. First, a brief introduction to the OLCT and some special nonuniform sampling models is given. Then, reconstruction theorems for random signals from nonuniform samples in the OLCT domain are derived for the different sampling models. Finally, simulation results verify the accuracy of the theoretical results.
Cited by: 3
A new pool control method for Boolean compressed sensing based adaptive group testing
Yujia Lu, K. Hayashi
DOI: 10.1109/APSIPA.2017.8282168
Abstract: In adaptive group testing, the pool (the set of items to be tested) used in the next test is determined from past test results, and performance depends heavily on the pool control method. This paper proposes a new pool control method for Boolean compressed sensing based adaptive group testing. The proposed method first selects the pool size for the next test by minimizing the expected approximate number of tests still required after that test, based on the estimated number of remaining positive items. If the selected pool size is one, the item with the highest probability of being positive is chosen as the pool; otherwise, a pool of the selected size is constructed by randomly selecting items. In addition, a new cardinality estimation method for positive items, which can run in parallel with the proposed pool control method, is also proposed. Computer simulation results reveal that adaptive group testing with the proposed method outperforms conventional methods both with and without knowledge of the cardinality of positive items.
Cited by: 1
Improving N-gram language modeling for code-switching speech recognition
Zhiping Zeng, Haihua Xu, Tze Yuang Chong, Chng Eng Siong, Haizhou Li
DOI: 10.1109/APSIPA.2017.8282279
Abstract: Code-switching language modeling is challenging because the statistics of each individual language, as well as cross-lingual statistics, are insufficient. To compensate for this statistical insufficiency, in this paper we propose a word-class n-gram language modeling approach in which only infrequent words are clustered while the most frequent words are treated as singleton classes. We first demonstrate the effectiveness of the proposed method, in terms of perplexity, on our English-Mandarin code-switching SEAME data. Compared with conventional word n-gram language models, as well as word-class n-gram language models in which the entire vocabulary is clustered, the proposed approach yields lower perplexity on our SEAME dev sets. We observed further perplexity reductions by interpolating the word n-gram language models with the proposed word-class n-gram language models. We also built word-class n-gram language models from third-party text data with the proposed method, and obtained similar perplexity improvements on the SEAME dev sets when interpolating them with the word n-gram language models. Finally, to examine the contribution of the proposed language modeling approach to code-switching speech recognition, we conducted lattice-based n-best rescoring.
Cited by: 13
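The partial-clustering idea, sharing statistics only among infrequent words while frequent words remain their own classes, can be sketched with a toy corpus. The single `<RARE>` class and the frequency threshold below are illustrative simplifications; the paper clusters infrequent words into multiple classes:

```python
from collections import Counter

def class_map(tokens, min_count=2):
    """Map each word to itself if frequent, else to a shared class token.
    One rare class and a count threshold are illustrative simplifications."""
    counts = Counter(tokens)
    return {w: (w if c >= min_count else "<RARE>") for w, c in counts.items()}

# Toy code-switching-flavoured corpus (English with Malay food terms).
corpus = "i like nasi lemak i like mee goreng i eat laksa".split()
cmap = class_map(corpus)
mapped = [cmap[w] for w in corpus]

# Class-level bigram counts: rare words now pool their statistics, so
# n-grams involving them are no longer singletons.
bigrams = Counter(zip(mapped, mapped[1:]))
print(cmap["laksa"], bigrams[("i", "like")])
```

In a full model these class bigram counts would back the class n-gram probabilities, which are then interpolated with the word n-gram model as the abstract describes.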