Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific: Latest Publications

Robust emotion recognition in live music using noise suppression and a hierarchical sparse representation classifier
Yu-Hao Chin, Chang-Hong Lin, Jia-Ching Wang
DOI: 10.1109/APSIPA.2014.7041629
Abstract: Recognition of emotional content in music has recently attracted attention. Music captured by live applications is often exposed to noise, which tends to reduce the recognition rate. This study proposes a robust music emotion recognition system for live applications. The proposed system consists of two major parts: subspace-based noise suppression and a hierarchical sparse representation classifier built on sparse coding and a sparse representation classifier (SRC). The music is first enhanced by fast subspace-based noise suppression. Nine classes of emotion are then used to construct a dictionary, and a vector of coefficients is obtained by sparse coding. The vector can be divided into nine parts, each of which models a specific emotional class of a signal. Since the proposed descriptor provides emotional content analysis at different resolutions, this work regards the vectors of coefficients as feature representations. Finally, a sparse representation based classification method is employed to classify music into four emotional classes. The experimental results confirm the highly robust performance of the proposed system for emotion recognition in live music.
Citations: 0
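The decision rule at the core of a sparse representation classifier (represent the query with each class's sub-dictionary and pick the class with the smallest reconstruction residual) can be sketched as follows. This minimal illustration uses per-class least squares instead of the paper's l1-regularized sparse coding over a joint nine-class dictionary; `src_classify` is a name invented here.

```python
import numpy as np

def src_classify(x, class_dicts):
    """Simplified SRC-style decision rule (illustrative sketch): fit the
    query vector x with each class's sub-dictionary by least squares and
    return the class index with the smallest reconstruction residual."""
    residuals = []
    for D in class_dicts:                          # D: (n_features, n_atoms)
        coef, *_ = np.linalg.lstsq(D, x, rcond=None)
        residuals.append(np.linalg.norm(x - D @ coef))
    return int(np.argmin(residuals))
```

In the full system, the l1 penalty makes the coefficient vector sparse, so its per-class blocks themselves serve as the emotional feature representation described in the abstract.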
Text localization in natural scene images with stroke width histogram and superpixel
Yu Zhou, Shuang Liu, Yongzheng Zhang, Yipeng Wang, Weiyao Lin
DOI: 10.1109/APSIPA.2014.7041656
Abstract: A novel stroke-based method to localize text in unconstrained natural scene images is proposed. First, to improve edge detection in difficult situations where the text is partially occluded or noisy, we use a stroke width histogram as guidance to generate a series of superpixels. Second, we present a novel way of using the distance transform and the Sobel operator to extract character skeletons, and then use the skeletons to improve stroke-width accuracy. Our method was evaluated on two standard datasets, ICDAR 2005 and ICDAR 2011, and the experimental results show that it achieves state-of-the-art performance.
Citations: 7
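The link between the distance transform and stroke width that the skeleton step exploits can be illustrated with a toy estimate: on a binary text mask, a medial-axis (skeleton) pixel lies roughly half a stroke width from the nearest boundary. The helper below is a hypothetical sketch under that observation, not the authors' pipeline.

```python
import numpy as np
from scipy import ndimage

def stroke_width_estimate(mask):
    """Rough stroke-width estimate from the Euclidean distance transform:
    each foreground pixel stores its distance to the nearest background
    pixel, and that distance peaks on the medial axis of the thickest
    stroke, so roughly twice the peak recovers the stroke width."""
    dist = ndimage.distance_transform_edt(mask)
    return 2.0 * dist.max() - 1.0
```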
High dynamic range imaging technology for micro camera array
Po-Hsiang Huang, Yuan-Hsiang Miao, Jiun-In Guo
DOI: 10.1109/APSIPA.2014.7041726
Abstract: A micro lens captures less light than a normal lens does, which results in noisy, low-quality images, and current image sensors cannot preserve the full dynamic range of the real world. HDR imaging from multi-exposure images overcomes these problems. Choosing good exposure times is a seldom discussed but important issue in HDR imaging. In this paper we propose a Histogram Based Exposure Time Selection (HBETS) method that automatically adjusts the exposure time of each lens for different scenes. Adopting the proposed weighting function restrains the randomly distributed noise caused by the micro lenses and produces a high-quality HDR image. We also propose an integrated tone mapping methodology that keeps all details in bright and dark regions when compressing the HDR image to an LDR image for display on monitors. The resulting image has an extended dynamic range, that is, more comprehensive scene information is preserved. Finally, we implemented the proposed 4-CAM HDR system on an Adlink MXC-6300 platform, reaching VGA video at 10 fps.
Citations: 3
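The idea behind histogram-based exposure selection can be sketched with a toy controller that inspects an 8-bit histogram and lengthens or shortens the exposure time. The function name, thresholds, and update factors below are illustrative assumptions, not the HBETS algorithm itself.

```python
import numpy as np

def adjust_exposure(img, t, low_frac=0.05, high_frac=0.05):
    """Toy histogram-driven exposure update: shorten the exposure when too
    many pixels are saturated, lengthen it when too many sit in the darkest
    bins, otherwise keep it. `img` is an 8-bit frame, `t` the current time."""
    hist, _ = np.histogram(img, bins=256, range=(0, 256))
    n = img.size
    if hist[255] / n > high_frac:       # clipped highlights: halve exposure
        return t * 0.5
    if hist[:16].sum() / n > low_frac:  # crushed shadows: double exposure
        return t * 2.0
    return t
```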
Range reduction of HDR images for backward compatibility with LDR image processing
M. Iwahashi, Taichi Yoshida, H. Kiya
DOI: 10.1109/APSIPA.2014.7041617
Abstract: This paper proposes a new range reduction method with the minimum L2-norm quantization error under an L-infinity-norm constraint. The dynamic range of pixel values in high dynamic range (HDR) images must be reduced for backward compatibility with low dynamic range (LDR) image processing systems. The simplest approach is to truncate the lower bit planes of the binary representation of the pixel values. However, this lacks fine granularity in the reduced range, does not utilize histogram sparseness, and generates a significant amount of quantization error. In this paper, we propose a new range reduction method that 1) utilizes histogram sparseness and 2) minimizes the variance of the error 3) under a specified maximum absolute error.
Citations: 4
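The histogram-sparseness idea can be illustrated with lossless histogram packing: an HDR image typically uses only a sparse subset of its nominal range, so mapping the values that actually occur onto consecutive integers shrinks the range. The paper additionally quantizes with minimum error variance under an L-infinity bound, which this sketch omits; the function names are invented here.

```python
import numpy as np

def pack_range(img):
    """Lossless histogram packing: replace each pixel value by its rank
    among the distinct values that occur, shrinking a sparse range."""
    values = np.unique(img)                    # sorted distinct pixel values
    lut = {v: i for i, v in enumerate(values)}
    packed = np.vectorize(lut.get)(img)
    return packed, values                      # `values` inverts the mapping

def unpack_range(packed, values):
    """Exact inverse: look each code up in the stored value table."""
    return values[packed]
```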
Unnecessary utterance detection for avoiding digressions in discussion
Riki Yoshida, Takuya Hiraoka, Graham Neubig, S. Sakti, T. Toda, Satoshi Nakamura
DOI: 10.1109/APSIPA.2014.7041572
Abstract: In this paper, we propose a method for avoiding digressions in discussions by detecting unnecessary utterances and having a dialogue system intervene. The detector is based on word-frequency and topic-shift features. The performance (accuracy, recall, precision, and F-measure) of the unnecessary utterance detector is evaluated through leave-one-dialogue-out cross-validation. In the evaluation, we find that the proposed detector outperforms a typical automatic summarization method.
Citations: 0
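A word-frequency feature of the kind the detector builds on can be sketched as an out-of-vocabulary rate against the discussion topic's word set. The feature definition and names here are assumptions for illustration; the actual detector also uses topic-shift features and a trained classifier.

```python
def off_topic_score(utterance, topic_words):
    """Illustrative word-frequency feature: the fraction of an utterance's
    words that never occur in the topic vocabulary. A high score suggests
    the utterance may be a digression."""
    words = utterance.lower().split()
    if not words:
        return 0.0
    misses = sum(1 for w in words if w not in topic_words)
    return misses / len(words)
```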
A method for emotional speech synthesis based on the position of emotional state in Valence-Activation space
Yasuhiro Hamada, Reda Elbarougy, M. Akagi
DOI: 10.1109/APSIPA.2014.7041729
Abstract: Speech-to-speech translation (S2ST) systems, which produce a spoken output in one language from a spoken utterance in another, are very important. So far, S2ST techniques have mainly used linguistic information without para- and non-linguistic information (emotion, individuality, gender, etc.), so such systems are limited to synthesizing neutral rather than affective (e.g. emotional) speech. To handle affective speech, a system that can recognize and synthesize emotional speech is required. Although most studies have treated emotions categorically, emotional styles are not categorical but spread continuously in an emotion space spanned by two dimensions, Valence and Activation. This paper proposes a method for synthesizing emotional speech based on positions in the Valence-Activation (V-A) space. To model the relationships between acoustic features and the V-A space, Fuzzy Inference Systems (FISs) were constructed, and twenty-one acoustic features were morphed using them. Listening tests were carried out to verify whether the synthesized speech is perceived at the intended position in the V-A space. The results indicate that the synthesized speech gives the same impression in the V-A space as the intended speech does.
Citations: 7
Forensics of image blurring and sharpening history based on NSCT domain
Yahui Liu, Yao Zhao, R. Ni
DOI: 10.1109/APSIPA.2014.7041728
Abstract: Detecting images that have undergone multiple manipulations has always been a realistic direction for digital image forensics and strongly attracts researchers' interest. However, the mutual effects of manipulations make it difficult to identify the processing history with existing single-manipulation detection methods. In this paper, a novel algorithm for detecting an image's blurring and sharpening history is proposed based on the non-subsampled contourlet transform (NSCT) domain. Two main sets of features are extracted from the NSCT domain: an extremum feature and a local directional similarity vector. The extremum feature comprises multiple maxima and minima of the NSCT coefficients at every scale; under blurring or sharpening, it tends to provide good discrimination. The directional similarity feature represents the correlation between a pixel and its neighbors, which is also altered by blurring or sharpening. For one pixel, the directional vector is composed of the coefficients of every directional subband at a certain scale, and the local directional similarity vector is obtained by computing similarities between the directional vector of a randomly selected pixel and the directional vectors of its 8-neighborhood pixels. With the proposed features, we are able to detect the two operations and determine their processing order at the same time. Experimental results show that the proposed algorithm is effective and accurate.
Citations: 1
A novel speech enhancement method using power spectra smooth in Wiener filtering
Feng Bao, Hui-jing Dou, Mao-shen Jia, C. Bao
DOI: 10.1109/APSIPA.2014.7041526
Abstract: In this paper, we propose a novel speech enhancement method that smooths the power spectra of speech and noise in Wiener filtering, based on the fact that the a priori SNR in standard Wiener filtering reflects the power ratio of speech to noise in each frequency bin. This power ratio can also be approximated by the smoothed spectra of speech and noise. We estimate the power spectra of noise and speech by the minima-controlled recursive averaging method and the spectral-subtraction principle, respectively. Linear prediction analysis is then used to smooth the power spectra of speech and noise in the frequency domain. Finally, we use the cross-correlation between the power spectra of the noisy speech and the noise to modify the spectral gains, further reducing noise in silent and unvoiced segments. Objective tests show that the proposed method outperforms conventional Wiener filtering and codebook-based methods.
Citations: 8
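The standard Wiener gain that this method builds on maps the a priori SNR of each frequency bin to a suppression factor, G = xi / (1 + xi). A minimal sketch follows; the paper's contributions (LP-based spectrum smoothing and cross-correlation gain modification) are not shown, and the names are illustrative.

```python
import numpy as np

def wiener_gain(speech_psd, noise_psd, floor=1e-10):
    """Textbook Wiener gain from the a priori SNR xi = P_speech / P_noise:
    bins dominated by speech get gain near 1, noise-dominated bins near 0."""
    xi = speech_psd / np.maximum(noise_psd, floor)   # a priori SNR per bin
    return xi / (1.0 + xi)

def enhance_frame(noisy_spec, speech_psd, noise_psd):
    """Apply the per-bin gain to the complex noisy spectrum of one frame."""
    return wiener_gain(speech_psd, noise_psd) * noisy_spec
```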
Fast image matching using multi-level texture descriptor
Hui-Fuang Ng, Chih-Yang Lin, Tatenda Muindisi
DOI: 10.1109/APSIPA.2014.7041672
Abstract: Image and video descriptors are now widely used in many computer vision applications. In this paper, a new hierarchical multiscale texture-based image descriptor for efficient image matching is introduced. The proposed descriptor uses mean values at multiple scale levels of an image region to convert the region into binary bitmaps, and then applies binary operations, which greatly reduces computation time and improves noise robustness, achieving stable and fast image matching. Experimental results show that the proposed method outperforms existing descriptors for image matching under varying illumination conditions and noise.
Citations: 1
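The mean-thresholded binary bitmap idea can be sketched as follows; the exact multiscale construction and matching rule are guesses for illustration, not the authors' descriptor. Thresholding each scale at its own mean makes the bitmaps invariant to gain and offset changes in illumination, and binary maps compare with cheap XOR operations instead of floating-point arithmetic.

```python
import numpy as np

def binary_bitmap(patch):
    """Threshold a patch at its mean to obtain a binary texture bitmap."""
    return (patch >= patch.mean()).astype(np.uint8)

def multilevel_descriptor(patch, levels=3):
    """Stack bitmaps of progressively 2x mean-pooled copies of the patch
    (an assumed multiscale scheme for illustration)."""
    maps, p = [], patch.astype(float)
    for _ in range(levels):
        maps.append(binary_bitmap(p))
        h, w = (p.shape[0] // 2) * 2, (p.shape[1] // 2) * 2
        p = p[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    return maps

def hamming_distance(desc_a, desc_b):
    """Match two descriptors by summing XOR mismatches across all levels."""
    return sum(int(np.sum(a ^ b)) for a, b in zip(desc_a, desc_b))
```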
Investigation and analysis on the effect of filtering mechanisms for 3D depth map coding
Xiaozhen Zheng, Weiran Li, Jianhua Zheng, Xu Chen
DOI: 10.1109/APSIPA.2014.7041669
Abstract: A depth map is a video component that carries the depth information of 3D objects; it is an important coding feature in recent 3D video coding standards and is used in the latest 3D coding approaches, e.g. MV-HEVC and 3D-HEVC. It has been shown that support for depth map coding can significantly improve coding performance for 3D video and provide more flexibility for 3D applications. Previous work shows that depth maps have coding properties different from those of traditional 2D sequences, and many coding tools influence and behave differently on these two kinds of content. This paper investigates and analyzes these phenomena for depth map coding, focusing on filtering mechanisms.
Citations: 0