{"title":"Perceptual evaluation of singing quality","authors":"Chitralekha Gupta, Haizhou Li, Ye Wang","doi":"10.1109/APSIPA.2017.8282110","DOIUrl":"https://doi.org/10.1109/APSIPA.2017.8282110","url":null,"abstract":"A perceptually valid automatic singing evaluation score could serve as a complement to singing lessons, and make singing training more accessible to the masses. In this study, we adopt the idea behind the PESQ (Perceptual Evaluation of Speech Quality) scoring metric, and propose various perceptually relevant features to evaluate singing quality. We correlate the obtained singing quality score, which we term the Perceptual Evaluation of Singing Quality (PESnQ) score, with that given by music-expert human judges, and compare the results with known baseline systems. It is shown that the proposed PESnQ has a correlation of 0.59 with human ratings, an improvement of ∼96% over baseline systems.","PeriodicalId":142091,"journal":{"name":"2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122039966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Word level prosody prediction using large audiobook dataset","authors":"Yanfeng Lu, Chenyu Yang, M. Dong","doi":"10.1109/APSIPA.2017.8282218","DOIUrl":"https://doi.org/10.1109/APSIPA.2017.8282218","url":null,"abstract":"Prosody modelling is an essential part of the text-to-speech synthesis system. In this paper, we propose and investigate a way to leverage public domain audiobook data for word level prosody modelling. Specifically, we base our work on the LibriSpeech project, in which a large quantity of public domain audiobook data from LibriVox was processed, selected and aligned with text. We choose a long short-term memory (LSTM) recurrent neural network as the modelling tool. The input word features span phonetic, syntactic, and semantic layers. The word prosody features include log F0, energy and after-word break. A way of incorporating the word prosody model into the speech synthesis system is also proposed. Experiments show that this is an effective way to leverage a large quantity and variety of speech data for prosody modelling in speech synthesis.","PeriodicalId":142091,"journal":{"name":"2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123972952","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast locally linear embedding algorithm for exemplar-based voice conversion","authors":"Yu-Huai Peng, Chin-Cheng Hsu, Yi-Chiao Wu, Hsin-Te Hwang, Yi-Wen Liu, Yu Tsao, H. Wang","doi":"10.1109/APSIPA.2017.8282112","DOIUrl":"https://doi.org/10.1109/APSIPA.2017.8282112","url":null,"abstract":"The locally linear embedding (LLE) algorithm has been proven to have high output quality and applicability for voice conversion (VC) tasks. However, the major shortcoming of the LLE-based VC approach is the time complexity (especially in the matrix inversion process) during the conversion phase. In this paper, we propose a fast version of the LLE algorithm that significantly reduces the complexity. In the proposed method, each locally linear patch on the data manifold is described by a pre-computed cluster of exemplars, and thus the major part of on-line computation can be carried out beforehand in the off-line phase. Experimental results demonstrate that the VC performance of the proposed fast LLE algorithm is comparable to that of the original LLE algorithm and that a real-time VC system becomes possible because of the highly reduced time complexity.","PeriodicalId":142091,"journal":{"name":"2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"184 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124676808","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SIMD acceleration for HEVC encoding on DSP","authors":"Yongfei Zhang, Rui Fan, Chao Zhang, G. Wang, Zhe Li","doi":"10.1109/APSIPA.2017.8282310","DOIUrl":"https://doi.org/10.1109/APSIPA.2017.8282310","url":null,"abstract":"As the new generation video coding standard, High Efficiency Video Coding (HEVC) significantly improves video compression efficiency, which, however, comes at the cost of a computational load far exceeding the real-time capacity of general-purpose processors and real-time video applications. In this paper, we focus on the SIMD-based fast implementation of the HEVC encoder on modern TI Digital Signal Processors (DSPs). We first profile the DSP-based HEVC encoder and identify the most time-consuming encoding modules. Then SIMD instructions are exploited to improve the parallel computing capacity of these modules and thus speed up the encoder. The experimental results show that the proposed implementations can significantly improve the encoding speed of the DSP-based HEVC encoder, with a speedup ratio of 8.38–87.32 over the original C-based encoder and 1.59–6.56 over the O3-optimization-enabled encoder.","PeriodicalId":142091,"journal":{"name":"2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128385304","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A comparison study of information contributions of phonemic contrasts in Mandarin","authors":"Yue Chen, Yanlu Xie, Jinsong Zhang","doi":"10.1109/APSIPA.2017.8282275","DOIUrl":"https://doi.org/10.1109/APSIPA.2017.8282275","url":null,"abstract":"Phonemic contrasts are the basis of speech communication. Previous studies have indicated that different phonemic contrasts make different information contributions. The inherent relationships between phonemes in information transmission can interpret various phenomena in speech and provide guidance for linguistic studies such as diachronic linguistics. To reveal the distribution structure of phonemes in Chinese, this paper used multidimensional scaling to comparatively analyze the information contributions of Initials and Finals (Chinese sub-syllabic units) in Mandarin. The contributions can be quantitatively measured by functional loads (FLs). The experimental results showed that: a) Initials at the same articulation place with different manners are more likely to have higher values of FLs, while Initials with the same manner at different places have lower values of FLs. b) Finals sharing the same onset vowels but different main vowels tend to have higher values of FLs. c) For both Initials and Finals, the closer the articulation places or the tongue positions of their onset vowels, the higher their values of FLs.","PeriodicalId":142091,"journal":{"name":"2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127269520","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detection of various image operations based on CNN","authors":"Hongshen Tang, R. Ni, Yao Zhao, Xiaolong Li","doi":"10.1109/APSIPA.2017.8282267","DOIUrl":"https://doi.org/10.1109/APSIPA.2017.8282267","url":null,"abstract":"Over the past years, a number of effective digital image forensic techniques have been proposed. However, most of them design features for a specific image operation and perform binary classification, which is impractical and fails to detect other operations. To detect various image operations, in this paper, we propose a carefully crafted CNN model that learns features from magnified images and performs multi-classification automatically. First, the images are magnified by nearest neighbor interpolation in the preprocessing layer; the properties of image operations are well preserved by this nearest up-sampling. Then, hierarchical representations of different operations are learned via two multi-scale convolutional layers. After that, the well-known mlpconv layers are used to enhance the whole architecture's nonlinear modeling ability and finally derive the feature map. Furthermore, shortcut connections between mlpconv layers allow increasing the depth of the network while reducing information loss. We present comprehensive experiments on 6 typical image operations. The results show that the proposed method performs well in both binary and multi-class detection.","PeriodicalId":142091,"journal":{"name":"2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129151220","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Min-max IIR filter design for feedback quantizers","authors":"S. Ohno, M. Tariq, M. Nagahara","doi":"10.1109/APSIPA.2017.8282157","DOIUrl":"https://doi.org/10.1109/APSIPA.2017.8282157","url":null,"abstract":"In networked control systems, transmitted data should be quantized into a relatively small number of bits if the rate of the communication channel is not sufficiently high. We propose a feedback quantizer for an implementable simple quantizer with high precision for networked control. The infinite impulse response (IIR) feedback filter is designed to mitigate the effect of the quantization error under a performance constraint of the feedback control system. Then, the minimum rate that achieves the constraint is numerically obtained. Simulations are provided to show the effectiveness of the proposed quantizer in a networked feedback control system.","PeriodicalId":142091,"journal":{"name":"2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125660115","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Emotional statistical parametric speech synthesis using LSTM-RNNs","authors":"Shumin An, Zhenhua Ling, Lirong Dai","doi":"10.1109/APSIPA.2017.8282282","DOIUrl":"https://doi.org/10.1109/APSIPA.2017.8282282","url":null,"abstract":"This paper studies methods for emotional statistical parametric speech synthesis (SPSS) using recurrent neural networks (RNNs) with long short-term memory (LSTM) units. Two modeling approaches, i.e., emotion-dependent modeling and unified modeling with emotion codes, are implemented and compared experimentally. In the first approach, LSTM-RNN-based acoustic models are built separately for each emotion type. A speaker-independent acoustic model estimated using speech data from multiple speakers is adopted to initialize the emotion-dependent LSTM-RNNs. Inspired by the speaker code techniques developed for speech recognition and speech synthesis, the second approach builds a unified LSTM-RNN-based acoustic model using the training data of a variety of emotion types. In the unified LSTM-RNN model, an emotion code vector is input to all model layers to indicate the emotion characteristics of the current utterance. Experimental results on an emotional speech synthesis database with four emotion types (neutral style, happiness, anger, and sadness) show that both approaches achieve significantly better naturalness of synthetic speech than HMM-based emotion-dependent modeling. The emotion-dependent modeling approach outperforms the unified modeling approach and the HMM-based emotion-dependent modeling in terms of the subjective emotion classification rates for synthetic speech. Furthermore, the emotion codes used by the unified modeling approach are capable of controlling the emotion type and intensity of synthetic speech effectively by interpolating and extrapolating the codes in the training set.","PeriodicalId":142091,"journal":{"name":"2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"274 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132958509","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Searchable encryption of image based on secret sharing scheme","authors":"A. Kamal, Keiichi Iwamura, Hyunho Kang","doi":"10.1109/APSIPA.2017.8282269","DOIUrl":"https://doi.org/10.1109/APSIPA.2017.8282269","url":null,"abstract":"Searchable encryption is a technique applied in cryptography that allows specific information in an encrypted content to be searched. The implementation of searchable encryption of images in cloud-based systems with multiple users allows each user to benefit from cloud computing, while the privacy and security of each content of a user cannot be breached by the other users. This is realized by distributing each image using our proposed secret sharing scheme to ensure that only the owner of the encrypted content is able to access it. In this paper, we describe the implementation method and the realization of searchable image encryption in a real-world application.","PeriodicalId":142091,"journal":{"name":"2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130888991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Grid-free compressive beamforming using a single moving sensor of known trajectory","authors":"Y. Ang, Nam Nguyen, J. P. Lie, W. Gan","doi":"10.1109/APSIPA.2017.8282046","DOIUrl":"https://doi.org/10.1109/APSIPA.2017.8282046","url":null,"abstract":"Recently, the grid-free compressive sensing (GFCS) approach was proposed to perform direction of arrival (DOA) estimation of sources. With the advancement of estimation techniques using a single sensor with a known trajectory, it is proposed that a GFCS method can be extended to achieve grid- free two-dimensional localization. Through the trajectory of the sensor, the proposed approach extracts the spatial information by first reformulating the single-channel signal into multiple waveforms, where each group of consecutive waveforms satisfying the quasi-stationary condition can be constructed into a virtual array called the sub one sensor array (SOSA). The DOA of the source with respect to each SOSA is then estimated with GFCS. Accordingly, the final location of the source is computed as the point that minimizes the mean square distance to all DOA lines. Numerical and experimental results demonstrate that the proposed approach is able to perform grid-free localization of a sound source.","PeriodicalId":142091,"journal":{"name":"2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"86 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127916761","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}