Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific: Latest Publications

Robust emotion recognition in live music using noise suppression and a hierarchical sparse representation classifier
Yu-Hao Chin, Chang-Hong Lin, Jia-Ching Wang
DOI: 10.1109/APSIPA.2014.7041629
Abstract: Recognition of emotional content in music has recently attracted attention. Music captured by live applications is often exposed to noise, which tends to reduce the recognition rate. This study proposes a robust music emotion recognition system for live applications. The proposed system consists of two major parts: subspace-based noise suppression and a hierarchical sparse representation classifier built on sparse coding and a sparse representation classifier (SRC). The music is first enhanced by fast subspace-based noise suppression. Nine classes of emotion are then used to construct a dictionary, and a vector of coefficients is obtained by sparse coding. The vector can be divided into nine parts, each of which models a specific emotional class of a signal. Since the proposed descriptor provides emotional content analysis at different resolutions, this work regards the vectors of coefficients as feature representations. Finally, a sparse representation based classification method is employed to classify music into four emotional classes. The experimental results confirm the highly robust performance of the proposed system for emotion recognition in live music.
Citations: 0
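The decision rule at the core of a sparse representation classifier (represent the query with each class's sub-dictionary and pick the class with the smallest reconstruction residual) can be sketched as follows. This minimal illustration uses per-class least squares instead of the paper's l1-regularized sparse coding over a joint nine-class dictionary; `src_classify` is a name invented here.

```python
import numpy as np

def src_classify(x, class_dicts):
    """Simplified SRC-style decision rule (illustrative sketch): fit the
    query vector x with each class's sub-dictionary by least squares and
    return the class index with the smallest reconstruction residual."""
    residuals = []
    for D in class_dicts:                          # D: (n_features, n_atoms)
        coef, *_ = np.linalg.lstsq(D, x, rcond=None)
        residuals.append(np.linalg.norm(x - D @ coef))
    return int(np.argmin(residuals))
```

In the full system, the l1 penalty makes the coefficient vector sparse, so its per-class blocks themselves serve as the emotional feature representation described in the abstract.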
Text localization in natural scene images with stroke width histogram and superpixel
Yu Zhou, Shuang Liu, Yongzheng Zhang, Yipeng Wang, Weiyao Lin
DOI: 10.1109/APSIPA.2014.7041656
Abstract: A novel stroke-based method to localize text in unconstrained natural scene images is proposed. First, to improve edge detection in difficult situations where the text is partially occluded or noisy, we use a stroke width histogram as guidance to generate a series of superpixels. Second, we present a novel way of using the distance transform and the Sobel operator to extract character skeletons, and then use the skeletons to improve stroke-width accuracy. Our method was evaluated on two standard datasets, ICDAR 2005 and ICDAR 2011, and the experimental results show that it achieves state-of-the-art performance.
Citations: 7
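The link between the distance transform and stroke width that the skeleton step exploits can be illustrated with a toy estimate: on a binary text mask, a medial-axis (skeleton) pixel lies roughly half a stroke width from the nearest boundary. The helper below is a hypothetical sketch under that observation, not the authors' pipeline.

```python
import numpy as np
from scipy import ndimage

def stroke_width_estimate(mask):
    """Rough stroke-width estimate from the Euclidean distance transform:
    each foreground pixel stores its distance to the nearest background
    pixel, and that distance peaks on the medial axis of the thickest
    stroke, so roughly twice the peak recovers the stroke width."""
    dist = ndimage.distance_transform_edt(mask)
    return 2.0 * dist.max() - 1.0
```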
High dynamic range imaging technology for micro camera array
Po-Hsiang Huang, Yuan-Hsiang Miao, Jiun-In Guo
DOI: 10.1109/APSIPA.2014.7041726
Abstract: A micro lens captures less light than a normal lens does, which results in noisy, low-quality images, and current image sensors cannot preserve the full dynamic range of the real world. HDR imaging from multi-exposure images overcomes these problems. Choosing good exposure times is a seldom discussed but important issue in HDR imaging. In this paper we propose a Histogram Based Exposure Time Selection (HBETS) method that automatically adjusts the exposure time of each lens for different scenes. Adopting the proposed weighting function restrains the randomly distributed noise caused by the micro lenses and produces a high-quality HDR image. We also propose an integrated tone mapping methodology that keeps all details in bright and dark regions when compressing the HDR image to an LDR image for display on monitors. The resulting image has an extended dynamic range, that is, more comprehensive scene information is preserved. Finally, we implemented the proposed 4-CAM HDR system on an Adlink MXC-6300 platform, reaching VGA video at 10 fps.
Citations: 3
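The idea behind histogram-based exposure selection can be sketched with a toy controller that inspects an 8-bit histogram and lengthens or shortens the exposure time. The function name, thresholds, and update factors below are illustrative assumptions, not the HBETS algorithm itself.

```python
import numpy as np

def adjust_exposure(img, t, low_frac=0.05, high_frac=0.05):
    """Toy histogram-driven exposure update: shorten the exposure when too
    many pixels are saturated, lengthen it when too many sit in the darkest
    bins, otherwise keep it. `img` is an 8-bit frame, `t` the current time."""
    hist, _ = np.histogram(img, bins=256, range=(0, 256))
    n = img.size
    if hist[255] / n > high_frac:       # clipped highlights: halve exposure
        return t * 0.5
    if hist[:16].sum() / n > low_frac:  # crushed shadows: double exposure
        return t * 2.0
    return t
```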
Range reduction of HDR images for backward compatibility with LDR image processing
M. Iwahashi, Taichi Yoshida, H. Kiya
DOI: 10.1109/APSIPA.2014.7041617
Abstract: This paper proposes a new range reduction method with the minimum L2-norm quantization error under an L-infinity-norm constraint. The dynamic range of pixel values in high dynamic range (HDR) images must be reduced for backward compatibility with low dynamic range (LDR) image processing systems. The simplest approach is to truncate the lower bit planes of the binary representation of the pixel values. However, this lacks fine granularity in the reduced range, does not utilize histogram sparseness, and generates a significant amount of quantization error. In this paper, we propose a new range reduction method that 1) utilizes histogram sparseness and 2) minimizes the variance of the error 3) under a specified maximum absolute error.
Citations: 4
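The histogram-sparseness idea can be illustrated with lossless histogram packing: an HDR image typically uses only a sparse subset of its nominal range, so mapping the values that actually occur onto consecutive integers shrinks the range. The paper additionally quantizes with minimum error variance under an L-infinity bound, which this sketch omits; the function names are invented here.

```python
import numpy as np

def pack_range(img):
    """Lossless histogram packing: replace each pixel value by its rank
    among the distinct values that occur, shrinking a sparse range."""
    values = np.unique(img)                    # sorted distinct pixel values
    lut = {v: i for i, v in enumerate(values)}
    packed = np.vectorize(lut.get)(img)
    return packed, values                      # `values` inverts the mapping

def unpack_range(packed, values):
    """Exact inverse: look each code up in the stored value table."""
    return values[packed]
```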
Unnecessary utterance detection for avoiding digressions in discussion
Riki Yoshida, Takuya Hiraoka, Graham Neubig, S. Sakti, T. Toda, Satoshi Nakamura
DOI: 10.1109/APSIPA.2014.7041572
Abstract: In this paper, we propose a method for avoiding digressions in discussions by detecting unnecessary utterances and having a dialogue system intervene. The detector is based on word-frequency and topic-shift features. The performance (accuracy, recall, precision, and F-measure) of the unnecessary utterance detector is evaluated through leave-one-dialogue-out cross-validation. In the evaluation, we find that the proposed detector outperforms a typical automatic summarization method.
Citations: 0
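A word-frequency feature of the kind the detector builds on can be sketched as an out-of-vocabulary rate against the discussion topic's word set. The feature definition and names here are assumptions for illustration; the actual detector also uses topic-shift features and a trained classifier.

```python
def off_topic_score(utterance, topic_words):
    """Illustrative word-frequency feature: the fraction of an utterance's
    words that never occur in the topic vocabulary. A high score suggests
    the utterance may be a digression."""
    words = utterance.lower().split()
    if not words:
        return 0.0
    misses = sum(1 for w in words if w not in topic_words)
    return misses / len(words)
```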
A method for emotional speech synthesis based on the position of emotional state in Valence-Activation space
Yasuhiro Hamada, Reda Elbarougy, M. Akagi
DOI: 10.1109/APSIPA.2014.7041729
Abstract: Speech-to-speech translation (S2ST) systems, which produce a spoken output in one language from a spoken utterance in another, are very important. So far, S2ST techniques have mainly used linguistic information without para- and non-linguistic information (emotion, individuality, gender, etc.), so such systems are limited to synthesizing neutral rather than affective (e.g. emotional) speech. To handle affective speech, a system that can recognize and synthesize emotional speech is required. Although most studies have treated emotions categorically, emotional styles are not categorical but spread continuously in an emotion space spanned by two dimensions, Valence and Activation. This paper proposes a method for synthesizing emotional speech based on positions in the Valence-Activation (V-A) space. To model the relationships between acoustic features and the V-A space, Fuzzy Inference Systems (FISs) were constructed, and twenty-one acoustic features were morphed using them. Listening tests were carried out to verify whether the synthesized speech is perceived at the intended position in the V-A space. The results indicate that the synthesized speech gives the same impression in the V-A space as the intended speech does.
Citations: 7
Forensics of image blurring and sharpening history based on NSCT domain
Yahui Liu, Yao Zhao, R. Ni
DOI: 10.1109/APSIPA.2014.7041728
Abstract: Detecting images that have undergone multiple manipulations has always been a realistic direction for digital image forensics and strongly attracts researchers' interest. However, the mutual effects of manipulations make it difficult to identify the processing history with existing single-manipulation detection methods. In this paper, a novel algorithm for detecting an image's blurring and sharpening history is proposed based on the non-subsampled contourlet transform (NSCT) domain. Two main sets of features are extracted from the NSCT domain: an extremum feature and a local directional similarity vector. The extremum feature comprises multiple maxima and minima of the NSCT coefficients at every scale; under blurring or sharpening, it tends to provide good discrimination. The directional similarity feature represents the correlation between a pixel and its neighbors, which is also altered by blurring or sharpening. For one pixel, the directional vector is composed of the coefficients of every directional subband at a certain scale, and the local directional similarity vector is obtained by computing similarities between the directional vector of a randomly selected pixel and the directional vectors of its 8-neighborhood pixels. With the proposed features, we are able to detect the two operations and determine their processing order at the same time. Experimental results show that the proposed algorithm is effective and accurate.
Citations: 1
A novel speech enhancement method using power spectra smooth in Wiener filtering
Feng Bao, Hui-jing Dou, Mao-shen Jia, C. Bao
DOI: 10.1109/APSIPA.2014.7041526
Abstract: In this paper, we propose a novel speech enhancement method that smooths the power spectra of speech and noise in Wiener filtering, based on the fact that the a priori SNR in standard Wiener filtering reflects the power ratio of speech to noise in each frequency bin. This power ratio can also be approximated by the smoothed spectra of speech and noise. We estimate the power spectra of noise and speech by the minima-controlled recursive averaging method and the spectral-subtraction principle, respectively. Linear prediction analysis is then used to smooth the power spectra of speech and noise in the frequency domain. Finally, we use the cross-correlation between the power spectra of the noisy speech and the noise to modify the spectral gains, further reducing noise in silent and unvoiced segments. Objective tests show that the proposed method outperforms conventional Wiener filtering and codebook-based methods.
Citations: 8
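The standard Wiener gain that this method builds on maps the a priori SNR of each frequency bin to a suppression factor, G = xi / (1 + xi). A minimal sketch follows; the paper's contributions (LP-based spectrum smoothing and cross-correlation gain modification) are not shown, and the names are illustrative.

```python
import numpy as np

def wiener_gain(speech_psd, noise_psd, floor=1e-10):
    """Textbook Wiener gain from the a priori SNR xi = P_speech / P_noise:
    bins dominated by speech get gain near 1, noise-dominated bins near 0."""
    xi = speech_psd / np.maximum(noise_psd, floor)   # a priori SNR per bin
    return xi / (1.0 + xi)

def enhance_frame(noisy_spec, speech_psd, noise_psd):
    """Apply the per-bin gain to the complex noisy spectrum of one frame."""
    return wiener_gain(speech_psd, noise_psd) * noisy_spec
```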
Fast image matching using multi-level texture descriptor
Hui-Fuang Ng, Chih-Yang Lin, Tatenda Muindisi
DOI: 10.1109/APSIPA.2014.7041672
Abstract: Image and video descriptors are now widely used in many computer vision applications. In this paper, a new hierarchical multiscale texture-based image descriptor for efficient image matching is introduced. The proposed descriptor uses mean values at multiple scale levels of an image region to convert the region into binary bitmaps, and then applies binary operations, which greatly reduces computation time and improves noise robustness, achieving stable and fast image matching. Experimental results show that the proposed method outperforms existing descriptors for image matching under varying illumination conditions and noise.
Citations: 1
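The mean-thresholded binary bitmap idea can be sketched as follows; the exact multiscale construction and matching rule are guesses for illustration, not the authors' descriptor. Thresholding each scale at its own mean makes the bitmaps invariant to gain and offset changes in illumination, and binary maps compare with cheap XOR operations instead of floating-point arithmetic.

```python
import numpy as np

def binary_bitmap(patch):
    """Threshold a patch at its mean to obtain a binary texture bitmap."""
    return (patch >= patch.mean()).astype(np.uint8)

def multilevel_descriptor(patch, levels=3):
    """Stack bitmaps of progressively 2x mean-pooled copies of the patch
    (an assumed multiscale scheme for illustration)."""
    maps, p = [], patch.astype(float)
    for _ in range(levels):
        maps.append(binary_bitmap(p))
        h, w = (p.shape[0] // 2) * 2, (p.shape[1] // 2) * 2
        p = p[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    return maps

def hamming_distance(desc_a, desc_b):
    """Match two descriptors by summing XOR mismatches across all levels."""
    return sum(int(np.sum(a ^ b)) for a, b in zip(desc_a, desc_b))
```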
Investigation and analysis on the effect of filtering mechanisms for 3D depth map coding
Xiaozhen Zheng, Weiran Li, Jianhua Zheng, Xu Chen
DOI: 10.1109/APSIPA.2014.7041669
Abstract: A depth map is a video component that carries the depth information of 3D objects; it is an important coding feature in recent 3D video coding standards and is used in the latest 3D coding approaches, e.g. MV-HEVC and 3D-HEVC. It has been shown that support for depth map coding can significantly improve coding performance for 3D video and provide more flexibility for 3D applications. Previous work shows that depth maps have coding properties different from those of traditional 2D sequences, and many coding tools influence and behave differently on these two kinds of content. This paper investigates and analyzes these phenomena for depth map coding, focusing on filtering mechanisms.
Citations: 0