2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC): latest articles

Deep acoustic-to-articulatory inversion mapping with latent trajectory modeling
Patrick Lumban Tobing, H. Kameoka, T. Toda
Abstract: This paper presents a novel implementation of latent trajectory modeling in a deep acoustic-to-articulatory inversion mapping framework. In the conventional methods, i.e., the Gaussian mixture model (GMM)-based and deep neural network (DNN)-based inversion mappings, frame interdependency can be considered while generating articulatory parameter trajectories by using an explicit constraint between static and dynamic features. However, this constraint is not considered when training these models, so the trained model is not optimal for the mapping procedure. In this paper, we address this problem by introducing latent trajectory modeling into the DNN-based inversion mapping. In the latent trajectory model, frame interdependency can be considered in both training and mapping by using a soft constraint between static and dynamic features. The experimental results demonstrate that the proposed latent trajectory DNN (LTDNN)-based inversion mapping outperforms the conventional and the state-of-the-art inversion mapping systems.
DOI: 10.1109/APSIPA.2017.8282219 (https://doi.org/10.1109/APSIPA.2017.8282219). Published: 2017-12-01.
Citations: 15
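The relation between static and dynamic features mentioned in the abstract above can be illustrated with a small sketch. It assumes the common first-order delta window, 0.5 * (y[t+1] - y[t-1]), with edge frames repeated. The window, the function names `delta` and `soft_constraint_penalty`, and the squared-error penalty are illustrative assumptions, not the formulation used in the paper.

```python
def delta(static):
    """Dynamic (delta) features of a 1-D trajectory.

    Assumed window: 0.5 * (next frame - previous frame), edge frames repeated.
    """
    n = len(static)
    return [0.5 * (static[min(t + 1, n - 1)] - static[max(t - 1, 0)])
            for t in range(n)]


def soft_constraint_penalty(static, dynamic):
    """Hypothetical soft constraint: squared mismatch between the predicted
    dynamic features and the deltas implied by the predicted static features."""
    return sum((d - e) ** 2 for d, e in zip(dynamic, delta(static)))


trajectory = [0.0, 1.0, 2.0, 3.0]
print(delta(trajectory))
# A prediction whose dynamics agree with its statics incurs no penalty.
print(soft_constraint_penalty(trajectory, delta(trajectory)))
```

A hard constraint would force the dynamic features to equal `delta(static)` exactly; a soft constraint of this kind only penalizes the disagreement, which is what lets it appear inside a training objective.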
Hybrid EEG-NIRS brain-computer interface under eyes-closed condition
Jaeyoung Shin, K. Müller, Han-Jeong Hwang
Abstract: In this study, we propose a hybrid brain-computer interface (BCI) combining electroencephalography (EEG) and near-infrared spectroscopy (NIRS) that can potentially be operated in an eyes-closed condition by paralyzed patients with oculomotor dysfunctions. In the experiment, seven healthy participants performed mental subtraction and stayed relaxed (baseline state) while EEG and NIRS data were measured simultaneously. To evaluate the feasibility of the hybrid BCI, we classified frontal brain activities induced by mental subtraction and the baseline state, and compared the classification accuracies obtained using the unimodal EEG BCI, the unimodal NIRS BCI, and the hybrid BCI. The hybrid BCI (85.54 ± 8.59 %) showed significantly higher classification accuracy than the unimodal EEG BCI (80.77 ± 11.15 %) and NIRS BCI (77.12 ± 7.63 %) (Wilcoxon signed-rank test, Bonferroni-corrected p < 0.05). The result demonstrates that our eyes-closed hybrid BCI approach could potentially be applied to neurodegenerative patients with impaired motor functions accompanied by a decline of visual functions.
DOI: 10.1109/APSIPA.2017.8282127 (https://doi.org/10.1109/APSIPA.2017.8282127). Published: 2017-12-01.
Citations: 1
Four-dimensional image compression with region of interest based on non-separable double lifting integer wavelet transform
Fairoza Amira Hamzah, Taichi Yoshida, M. Iwahashi
Abstract: This paper improves the coding performance for four-dimensional (4D) images with region of interest (ROI) coding implemented in the non-separable double lifting structure of the 4D integer wavelet transform (WT). The WT succeeded the discrete cosine transform (DCT) and has been used in the international image compression standard JPEG 2000 for more than a decade. The conventional lifting structure, known as the separable structure, has many rounding operators, which increase the rounding noise inside the transform. The higher the rounding noise, the lower the coding performance. A non-separable structure of the double lifting WT is therefore introduced to reduce the rounding noise. The non-separable structure is compatible with the conventional wavelet-based JPEG 2000. Furthermore, ROI coding based on the non-separable integer WT is proposed, using both lossy and lossless compression, and the proposed method was observed to increase the coding performance for 4D images.
DOI: 10.1109/APSIPA.2017.8282329 (https://doi.org/10.1109/APSIPA.2017.8282329). Published: 2017-12-01.
Citations: 1
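To make the role of the rounding operators concrete, here is a minimal sketch of a one-dimensional integer lifting wavelet step. It uses the reversible 5/3 lifting steps known from JPEG 2000 lossless coding, with mirrored boundaries and an even-length input. It is an assumption-laden illustration of where rounding enters a lifting structure, not the non-separable 4D double lifting structure proposed in the paper.

```python
def lift_53_forward(x):
    """One level of the reversible 5/3 integer lifting transform (even-length input)."""
    assert len(x) >= 2 and len(x) % 2 == 0
    even, odd = x[0::2], x[1::2]
    n = len(odd)
    # Predict step: the floor division is a rounding operator.
    d = [odd[i] - (even[i] + even[min(i + 1, n - 1)]) // 2 for i in range(n)]
    # Update step: a second rounding operator.
    s = [even[i] + (d[max(i - 1, 0)] + d[i] + 2) // 4 for i in range(n)]
    return s, d  # low-pass and high-pass integer coefficients


def lift_53_inverse(s, d):
    """Undo the lifting steps in reverse order; the roundings cancel exactly."""
    n = len(d)
    even = [s[i] - (d[max(i - 1, 0)] + d[i] + 2) // 4 for i in range(n)]
    odd = [d[i] + (even[i] + even[min(i + 1, n - 1)]) // 2 for i in range(n)]
    x = [0] * (2 * n)
    x[0::2], x[1::2] = even, odd
    return x


signal = [12, 15, 11, 8, 9, 14, 20, 19]
low, high = lift_53_forward(signal)
print(low, high)
print(lift_53_inverse(low, high) == signal)
```

Each lifting step rounds once, so the transform stays lossless on integers, but every rounding also adds a small error relative to the real-valued transform. That is the rounding noise the abstract refers to when the output is coded lossily.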
Investigating the use of scattering coefficients for replay attack detection
Kaavya Sriskandaraja, Gajan Suthokumar, V. Sethu, E. Ambikairajah
Abstract: Widespread adoption of speaker verification for security relies on the existence of effective anti-spoofing countermeasures. This paper presents a countermeasure based on spectral features to detect replay spoofing attacks on automatic speaker verification systems. In particular, the use of hierarchical scattering decomposition coefficients and inverse-mel frequency cepstral coefficients is explored. Our best system achieved a relative improvement in equal error rate of around 70% on the development set and 20% on the evaluation set compared with the baseline on the ASVspoof 2017 database. In addition, we show that features with a shorter window can be beneficial for detecting replayed speech, in contrast to speech synthesis and voice conversion attacks.
DOI: 10.1109/APSIPA.2017.8282211 (https://doi.org/10.1109/APSIPA.2017.8282211). Published: 2017-12-01.
Citations: 19
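The results in the abstract above are reported as relative improvements in equal error rate (EER). The sketch below shows one simple way to approximate an EER from detection scores and to compute a relative improvement. The function names, the threshold sweep, and the toy scores are assumptions for illustration only; they are not the ASVspoof 2017 scoring tool or data from the paper.

```python
def equal_error_rate(genuine_scores, spoof_scores):
    """Approximate the EER by trying every observed score as a threshold.

    Scores at or above the threshold are accepted as genuine speech.
    """
    best_gap, best_eer = None, None
    for threshold in sorted(set(genuine_scores) | set(spoof_scores)):
        # False acceptance: spoofed trials accepted as genuine.
        far = sum(s >= threshold for s in spoof_scores) / len(spoof_scores)
        # False rejection: genuine trials rejected.
        frr = sum(g < threshold for g in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


def relative_improvement(baseline_eer, system_eer):
    """Fraction by which the system reduces the baseline EER."""
    return (baseline_eer - system_eer) / baseline_eer


# Toy scores (hypothetical): higher means "more likely genuine".
genuine = [0.9, 0.8, 0.7, 0.6]
spoof = [0.65, 0.3, 0.2, 0.1]
print(equal_error_rate(genuine, spoof))
print(relative_improvement(10.0, 3.0))
```

Under this definition, a relative improvement of 0.7 means the system's EER is 30% of the baseline's EER, not that the EER dropped by 70 percentage points.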
Efficient edge-oriented based image interpolation algorithm for non-integer scaling factor
Chia-Chun Hsu, Jian-Jiun Ding, Yih-Cherng Lee
Abstract: Although image interpolation has been developed for many years, most state-of-the-art methods, including machine learning based methods, can only zoom the image by a scaling factor of 2, 3, 2^k, or other integer values. Hence, bicubic interpolation is still a popular method for the non-integer scaling problem. In this paper, we propose a novel interpolation algorithm for image zooming with non-integer scaling factors based on the gradient direction. The proposed method first estimates the gradient direction for each pixel in the low-resolution image. Then, we construct the gradient map for the high-resolution image by spline interpolation. Finally, the intensity of each missing pixel is computed as the weighted sum of the pixels in a pre-defined window. To preserve edge information during interpolation, the weight is determined by the inner product of the estimated gradient vector and the vector from the missing pixel to the known data point. Simulations show that the proposed method performs better than other non-integer scaling methods and is helpful for super-resolution.
DOI: 10.1109/APSIPA.2017.8282202 (https://doi.org/10.1109/APSIPA.2017.8282202). Published: 2017-12-01.
Citations: 0
Exploiting imbalanced textual and acoustic data for training prosodically-enhanced RNNLMs
Michael Hentschel, A. Ogawa, Marc Delcroix, T. Nakatani, Yuji Matsumoto
Abstract: There have been many attempts to exploit sources of information besides words in language modelling, for instance prosody or topic information. With neural network based language models, it became easier to use this continuous-valued information, because the neural network transforms the discrete-valued space into a continuous-valued space. So far, models incorporating prosodic information have been trained jointly on the auxiliary and the textual information from the beginning. In practice, however, the auxiliary information is usually available for only a small part of the training data. To fully exploit text and acoustic data, we propose to re-train a recurrent neural network language model (RNNLM) rather than train a language model from scratch. Using this method, we achieved perplexity and word error rate reductions for N-best rescoring on the MIT-OCW lecture corpus.
DOI: 10.1109/APSIPA.2017.8282099 (https://doi.org/10.1109/APSIPA.2017.8282099). Published: 2017-12-01.
Citations: 1
A real time micro-expression detection system with LBP-TOP on a many-core processor
X. Soh, Vishnu Monn Baskaran, Adamu Muhammad Buhari, R. Phan
Abstract: Implementing a micro-expression detection system makes it challenging to sustain real-time recognition. To address these problems, this paper examines a serial Local Binary Pattern from Three Orthogonal Planes (LBP-TOP) algorithm to identify the performance limitations for a real-time system. Videos from SMIC and CASMEII were upsampled to higher resolutions (280×340, 560×680 and 1120×1360) to cater to the needs of real-life implementation. Then, a parallel multicore-based LBP-TOP algorithm is studied as a benchmark. Experimental results show that the parallel LBP-TOP algorithm achieves 7× and 8× speedups over the serial LBP-TOP for the SMIC and CASMEII databases, respectively, at the highest tested video resolution on a multi-core architecture with 24 logical processors. To further reduce the computation time, this paper also proposes a many-core parallel LBP-TOP algorithm using Compute Unified Device Architecture (CUDA). In addition, a method is designed to calculate the threads and blocks required to launch the kernel when processing videos of different resolutions. The proposed algorithm increases the speedup to 117× and 130× over the serial algorithm for the highest tested resolution videos.
DOI: 10.1109/APSIPA.2017.8282041 (https://doi.org/10.1109/APSIPA.2017.8282041). Published: 2017-12-01.
Citations: 4
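Two ideas from the abstract above lend themselves to a short sketch: the basic local binary pattern (LBP) code that LBP-TOP computes on the XY, XT and YT planes of a video, and the arithmetic for sizing a kernel launch at different resolutions. The neighbour ordering, the one-thread-per-pixel mapping, the default of 256 threads per block, and the function names are all assumptions for illustration. The paper's own CUDA kernel and launch method are not reproduced here.

```python
# Neighbour offsets (row, col), clockwise from the top-left pixel.
# The bit order is an assumed convention; implementations differ.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(plane, row, col):
    """8-bit LBP code of an interior pixel in one 2-D plane."""
    center = plane[row][col]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        # Set the bit when the neighbour is at least as bright as the center.
        if plane[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def launch_dims(width, height, threads_per_block=256):
    """Blocks needed to cover every pixel with one thread each (ceiling division)."""
    pixels = width * height
    blocks = (pixels + threads_per_block - 1) // threads_per_block
    return blocks, threads_per_block


patch = [[9, 0, 0],
         [0, 5, 0],
         [0, 0, 0]]
print(lbp_code(patch, 1, 1))
for width, height in [(280, 340), (560, 680), (1120, 1360)]:
    print((width, height), launch_dims(width, height))
```

Ceiling division matters because the pixel count is not generally a multiple of the block size. The last block may then hold idle threads, which a real kernel has to guard against with a bounds check.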
Prediction techniques for wavelet based 1-D signal compression
I-Hsiang Wang, Jian-Jiun Ding, H. Hsu
Abstract: This paper proposes a novel one-dimensional (1-D) signal compression technique. We first perform beat alignment to transform a 1-D signal into a 2-D signal, then use the 2-D discrete wavelet transform (DWT) to decompose the 2-D signal into multiple subbands. The coefficients in certain subbands are then coded using simple differential pulse code modulation (DPCM). Next, we construct one neural network for each subband (except the LL subband) to perform prediction. Based on the prediction results, we construct a type of pixel-wise context A to determine the activity of a given pixel. Finally, the DWT coefficients and the residues from DPCM are bit-plane coded using Embedded Block Coding with Optimized Truncation (EBCOT) from JPEG 2000. We evaluated the method on a well-known 1-D signal, the ECG signals in the MIT-BIH database, and it demonstrated significant improvement over existing methods.
DOI: 10.1109/APSIPA.2017.8281996 (https://doi.org/10.1109/APSIPA.2017.8281996). Published: 2017-12-01.
Citations: 1
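The DPCM step mentioned in the abstract above can be sketched in a few lines. This version assumes the simplest possible predictor, the previous sample, and lossless integer residues. The abstract does not specify the predictor, so the function names and this choice are illustrative assumptions.

```python
def dpcm_encode(samples):
    """Replace each sample by its difference from the previous sample.

    The first residue is taken relative to an assumed initial prediction of 0.
    """
    residues, previous = [], 0
    for sample in samples:
        residues.append(sample - previous)
        previous = sample
    return residues


def dpcm_decode(residues):
    """Rebuild the samples by accumulating the residues."""
    samples, previous = [], 0
    for residue in residues:
        previous += residue
        samples.append(previous)
    return samples


coefficients = [10, 12, 11, 11, 14]
residues = dpcm_encode(coefficients)
print(residues)
print(dpcm_decode(residues) == coefficients)
```

When neighbouring coefficients are similar, the residues cluster around zero and cost fewer bits in the later entropy coding stage than the raw values would.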
A rail detection algorithm based on pair particles filtering
Ji-Sang Bae, Jong-Ok Kim
Abstract: The safety of mass transportation such as trains cannot be emphasized enough, and accurate detection of the rails in the direction of travel can contribute to the safe operation of a train. In this paper, we propose a new rail detection algorithm based on pair particles filtering that simultaneously predicts the positions of the left and right rails as a pair. Multiple pairs of particles are first generated from the previously detected rails, and the features of pair particle position, rail gauge, and gradient magnitude are used to detect the positions of the rail pair. The proposed method robustly detects both straight and curved rails. Experiments with various actual rail images show plausible detection results for the proposed method.
DOI: 10.1109/APSIPA.2017.8282258 (https://doi.org/10.1109/APSIPA.2017.8282258). Published: 2017-12-01.
Citations: 0
Epileptic focus localization based on bivariate empirical mode decomposition and entropy
Tatsunori Itakura, Toshihisa Tanaka
Abstract: Epilepsy is a neurological disorder that causes abnormal discharges in the brain. Epileptic focus localization is an important factor for successful epilepsy surgery. The intracranial electroencephalogram (iEEG) is the most used signal for detecting the epileptic focus. The iEEG signals are obtained from a publicly available database that consists of 7,500 signal pairs. Empirical mode decomposition (EMD) has been successfully applied to this dataset to detect the epileptic focus. However, the EMD method is not suitable for iEEG signal pairs. In this paper, a method for classifying focal and non-focal iEEG signals using bivariate EMD (BEMD) is presented. The bivariate iEEG signals are decomposed into signal components of the same frequency band. Various entropy measures are calculated from the intrinsic mode functions (IMFs) of the iEEG signals. Then, some or all of the entropies are chosen as features, which are classified as focal or non-focal iEEG using a support vector machine (SVM). Experimental results show that the proposed method is able to differentiate focal from non-focal iEEG signals with an average classification accuracy of 86.89%.
DOI: 10.1109/APSIPA.2017.8282255 (https://doi.org/10.1109/APSIPA.2017.8282255). Published: 2017-12-01.
Citations: 28