2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC): Latest Papers, Page 8

Visual Saliency Detection Algorithm in Compressed HEVC Domain

2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2018-11-01 DOI: 10.23919/APSIPA.2018.8659565

Rui Bai, Wei Zhou, Guanwen Zhang, Henglu Wei

Citations: 4

Active Speech Obscuration with Speaker-dependent Human Speech-like Noise for Speech Privacy

2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2018-11-01 DOI: 10.23919/APSIPA.2018.8659754

Yoshitaka Ohshio, Haruka Adachi, Kenta Iwai, T. Nishiura, Y. Yamashita

Abstract: This paper introduces a new active speech obscuration method with speaker-dependent human speech-like noise (HSLN) for speech privacy. Speech privacy has recently come to be regarded as an important issue in open public spaces such as hospitals, pharmacies, and banks. To protect speech privacy, speech obscuration methods utilizing HSLN have been studied. HSLNs are designed by superposing various speech signals, and speech obscuration is achieved when the target speech and the HSLN are heard at the same time. Conventionally, HSLN is designed with the pitch of the target speech as the sole speaker-dependent characteristic. However, additional speaker-dependent characteristics are required because the performance of speech obscuration is still insufficient. Therefore, we propose a speaker-dependent HSLN design method for effective speech obscuration that uses the third formant frequency of the target speech in addition to pitch as speaker-dependent characteristics. The third formant frequency is related to voice quality, which depends on the shape and length of the vocal tract. It follows that the proposed method can effectively mask the target speech with an HSLN that reflects the pitch and third formant frequency analyzed from the speech. Experimental results demonstrate the effectiveness of the proposed method.

Citations: 3

Wavelet Scattering Transform for Variability Reduction in Cortical Potentials Evoked by Pitch Matched Electro-acoustic Stimulation in Unilateral Cochlear Implant Patients

2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2018-11-01 DOI: 10.23919/APSIPA.2018.8659659

M. Heydarzadeh, Sara Akbarzadeh, Chin-Tuan Tan

Abstract: A cochlear implant (CI) restores the hearing sensation in profoundly deaf patients by directly stimulating the auditory nerve with electric pulses using an array of tonotopically inserted electrodes. Basal electrodes stimulate in response to high input frequencies, while apical electrodes stimulate in response to low input frequencies. The problem with this electrical stimulation, particularly in unilaterally implanted users who have residual hearing in the contralateral ear, lies in the frequency mismatch between the characteristic frequency of the auditory nerve and the input signal. In this paper, we revisit our previously proposed mechanism for tuning an intra-cochlear electrode to its pitch-matched frequency using single-channel EEG [1]. We apply the wavelet scattering transform to extract a deformation invariant from the EEG signal recorded from each of 10 CI subjects while they were listening to pitch-matched electro-acoustic stimulation. Results show that the wavelet scattering transform is able to capture the variability introduced by different subjects and is a more robust alternative for revealing the underlying neurophysiological responses to this perceptual event.

Citations: 0

Reward Only Training of Encoder-Decoder Digit Recognition Systems Based on Policy Gradient Methods

2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2018-11-01 DOI: 10.23919/APSIPA.2018.8659527

Yilong Peng, Hayato Shibata, T. Shinozaki

Abstract: Zero-resource speech recognition is attracting attention for engineering as well as scientific purposes. With existing unsupervised learning frameworks that use only speech input, however, it is impossible to associate automatically discovered linguistic units with spellings and concepts. In this paper, we propose an approach that uses a scalar reward that is assumed to be given for each decoding result of an utterance. While the approach is straightforward to formulate with reinforcement learning, the difficulty is obtaining convergence without the help of supervised learning. Focusing on encoder-decoder based speech recognition, we explore several neural network architectures, optimization methods, and reward definitions, seeking a suitable configuration for policy gradient reinforcement learning. Experiments were performed using connected digit utterances from the TIDIGITS corpus as training and evaluation sets. Although the task is challenging, we show that learning a connected digit recognition system is possible, achieving a digit error rate of 13.6%. The success largely depends on the configuration, and we identify appropriate conditions, which differ substantially from those of supervised training.
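The reward-only idea in this abstract can be illustrated with a toy REINFORCE update. This is a minimal sketch under strong assumptions, not the authors' system: the encoder-decoder is replaced by a single softmax over ten digit classes, the reward is 1 for a correct guess and 0 otherwise, and the names `train_reward_only`, `target`, and `lr` are hypothetical. The learner never sees the correct label, only the scalar reward.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D array of logits.
    e = np.exp(z - z.max())
    return e / e.sum()

def train_reward_only(target=7, n_classes=10, steps=2000, lr=0.5, seed=0):
    """Toy REINFORCE loop: learn to emit `target` from a 0/1 reward only."""
    rng = np.random.default_rng(seed)
    logits = np.zeros(n_classes)          # policy parameters (assumed form)
    for _ in range(steps):
        probs = softmax(logits)
        action = rng.choice(n_classes, p=probs)   # sample a "decoding result"
        reward = 1.0 if action == target else 0.0  # scalar reward, no label
        # Gradient of log pi(action) w.r.t. the logits: onehot(action) - probs.
        grad_log_prob = -probs
        grad_log_prob[action] += 1.0
        # Policy gradient ascent step, weighted by the reward.
        logits += lr * reward * grad_log_prob
    return softmax(logits)

probs = train_reward_only()
print(int(np.argmax(probs)), float(probs.max()))
```

With a 0/1 reward and no baseline, only rewarded samples change the parameters, so the probability of the rewarded digit never decreases. Sequence-level systems such as the one described in the abstract are much harder to train this way, which is consistent with its remark that convergence is the main difficulty.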

Citations: 0

Alternating Binary Classifier and Graph Learning from Partial Labels

2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2018-11-01 DOI: 10.23919/APSIPA.2018.8659448

Cheng Yang, Gene Cheung, V. Stanković

Abstract: Semi-supervised binary classifier learning is a fundamental machine learning task where only partial binary labels are observed, and the labels of the remaining data need to be interpolated. Leveraging advances in graph signal processing (GSP), binary classifier learning has recently been posed as a signal restoration problem regularized using a graph smoothness prior, where the undirected graph consists of a set of vertices and a set of weighted edges connecting vertices with similar features. In this paper, we improve the performance of such a graph-based classifier by simultaneously optimizing the feature weights used in the construction of the similarity graph. Specifically, we start by interpolating missing labels: we first formulate a boolean quadratic program with a graph signal smoothness objective, then relax it to a convex semi-definite program, solvable in polynomial time. Next, we optimize the feature weights used for construction of the similarity graph by reusing the smoothness objective but with a convex set constraint for the weight vector. The reposed convex but non-differentiable problem is solved via an iterative proximal gradient descent algorithm. The two steps are solved alternately until convergence. Experimental results show that our alternating classifier / graph learning algorithm outperforms existing graph-based methods and support vector machines with various kernels.

Funding note: The work is partly funded by the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 734331.
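The graph smoothness prior mentioned in this abstract can be sketched on a tiny example. The code below is not the paper's algorithm: it swaps the boolean quadratic program and its semi-definite relaxation for a simpler real-valued relaxation (harmonic interpolation, which minimizes x^T L x with the known labels held fixed), and it omits the feature-weight optimization step entirely. The function name `interpolate_labels`, the Gaussian kernel width `sigma`, and the 1-D features are illustrative assumptions.

```python
import numpy as np

def interpolate_labels(features, labels, known, sigma=0.5):
    """Fill in missing +/-1 labels by minimizing the graph smoothness x^T L x."""
    f = np.asarray(features, dtype=float)
    y = np.asarray(labels, dtype=float)
    known = np.asarray(known, dtype=bool)

    # Similarity graph: Gaussian kernel on 1-D feature distance, no self-loops.
    W = np.exp(-((f[:, None] - f[None, :]) ** 2) / sigma**2)
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(axis=1)) - W        # combinatorial graph Laplacian

    # Harmonic solution: L_uu x_u = -L_ul x_l, with x_l the observed labels.
    x = np.zeros(len(f))
    x[known] = y[known]
    L_uu = L[~known][:, ~known]
    L_ul = L[~known][:, known]
    x[~known] = np.linalg.solve(L_uu, -L_ul @ x[known])
    return np.sign(x), x

signs, x = interpolate_labels(
    features=[0.0, 0.1, 0.9, 1.0],
    labels=[1, 0, 0, -1],                 # entries at unknown nodes are ignored
    known=[True, False, False, True],
)
print(signs, x)
```

Each unlabeled node ends up with the weighted average of its neighbours' values, so a node whose features are close to a labeled node inherits that node's sign. Changing the feature weights changes `W`, which is the quantity the paper's second step optimizes.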

Citations: 12

Unsupervised Singing Voice Separation Using Gammatone Auditory Filterbank and Constraint Robust Principal Component Analysis

2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2018-11-01 DOI: 10.23919/APSIPA.2018.8659640

Feng Li, M. Akagi

Citations: 2

An Effective Tensor Completion Method Based on Multi-linear Tensor Ring Decomposition

2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2018-11-01 DOI: 10.23919/APSIPA.2018.8659492

Jinshi Yu, Guoxu Zhou, Qibin Zhao, Kan Xie

Citations: 12

Microphone Position Realignment by Extrapolation of Virtual Microphone

2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2018-11-01 DOI: 10.23919/APSIPA.2018.8659728

R. Jinzai, K. Yamaoka, Mitsuo Matsumoto, Takeshi Yamada, S. Makino

Abstract: In this paper, we propose microphone realignment by phase extrapolation using the virtual microphone technique, in order to reproduce binaural signals with adequate interaural time differences (ITDs) for a listener. For a sound source in the horizontal plane, ITDs are a major cue for localizing a sound image. Since ITDs are not considered for headphone listening in conventional amplitude panning in multichannel recording, sound images are localized inside the head (lateralization). A microphone array can record signals with time differences corresponding to the directions of sound sources. However, since the microphones in such an array are closely positioned, the time differences are inappropriate as ITDs for localizing the sound images of the sources. In this paper, phase extrapolation using the virtual microphone technique is applied to the virtual realignment of a microphone in such an array to restore the ITD. In the experiments, two speech signals were used as sound sources located at the leftmost and rightmost positions from the viewpoint of two real microphones positioned 2.83 cm apart. The phase of the signal of the virtually realigned microphone was extrapolated to eight times the phase difference between the two real microphones. Time differences between the signal of one of the real microphones and that of the realigned one were observed to be $-500\,\mu\mathrm{s}$ for the source on the left and $500\,\mu\mathrm{s}$ for the source on the right. Furthermore, the interaural cross-correlations of the two signals suggest that sound images will be perceived on both the left and right of a listener. With this method, prior information on the number of sources and the direction of arrival is expected to be unnecessary, and adjustment for individual differences is expected to be easy.
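The effect of extrapolating an inter-microphone phase difference can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: it processes one FFT frame of white noise instead of speech in a time-frequency representation, keeps the reference magnitude unchanged, and uses a hypothetical helper `extrapolate_phase`. The 16 kHz sampling rate and the one-sample (62.5 µs) delay between the two real microphones are chosen only so that an eightfold extrapolation gives 8 samples, i.e. 500 µs.

```python
import numpy as np

def extrapolate_phase(x_ref, x_near, factor):
    """Build a virtual microphone signal whose phase lag relative to x_ref
    is `factor` times the lag observed between x_ref and x_near."""
    X_ref = np.fft.rfft(x_ref)
    X_near = np.fft.rfft(x_near)
    # Per-bin phase difference between the two real microphones.
    dphi = np.angle(X_near * np.conj(X_ref))
    # Keep the reference magnitude, scale the phase difference linearly.
    X_virtual = X_ref * np.exp(1j * factor * dphi)
    return np.fft.irfft(X_virtual, n=len(x_ref))

fs = 16000                                 # assumed sampling rate in Hz
rng = np.random.default_rng(0)
x_ref = rng.standard_normal(1024)          # stand-in for a recorded frame
x_near = np.roll(x_ref, 1)                 # second mic: 1 sample = 62.5 us later
x_virtual = extrapolate_phase(x_ref, x_near, factor=8)

delay_us = 8 / fs * 1e6                    # delay implied by factor 8
print(np.allclose(x_virtual, np.roll(x_ref, 8)), delay_us)
```

The sketch ignores phase wrapping, which is harmless here because the real delay is a single sample and the factor is an integer. For larger microphone spacings or non-integer factors, the wrapped phase difference is ambiguous at high frequencies and would need separate handling.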

Citations: 7

A Lung Disease Classification Based on Feature Fusion Convolutional Neural Network with X-ray Image Enhancement

2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2018-11-01 DOI: 10.23919/APSIPA.2018.8659700

Yue Cheng, Jinchao Feng, Ke-bin Jia

Citations: 8

Privacy-Preserving SVM Computing in the Encrypted Domain

2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2018-11-01 DOI: 10.23919/APSIPA.2018.8659529

Takahiro Maekawa, Ayana Kawamura, Yuma Kinoshita, H. Kiya

Citations: 14