2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC): Latest Articles, Page 9

Measuring Infant's Length with an Image

2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2018-11-01 DOI: 10.23919/APSIPA.2018.8659482

Maolong Tang, Ming-Ting Sun, Leonardo Seda, J. Swanson, Zhengyou Zhang

Citations: 1

Semi-Supervised NMF in the chroma Domain Applied to Music Harmony Estimation

2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2018-11-01 DOI: 10.23919/APSIPA.2018.8659645

Takuya Takahashi, T. Hori, Christoph M. Wilk, S. Sagayama

Abstract: In this paper, we discuss non-negative matrix factorization (NMF) applied to chroma feature sequences to reduce the chroma-specific noise in chord estimation from music signals using the hidden Markov model (HMM). Even in the case of single pitch sounds, the raw 12-dimensional chroma vectors obtained from the music signal by summing and normalizing the spectrum by octaves often contain irrelevant components such as non-octave overtones falling into different pitch classes and cause inaccuracies in estimation of harmonies. NMF applied to the chroma domain is expected to suppress such chroma components in the NMF activation matrix caused by overtones, and thus “purifies” the noisy chroma vectors. By reducing the dimensionality to 12 dimensions as opposed to NMF applied to the raw spectrum, we expect advantages with respect to statistical robustness as well as computational cost for pitch class estimation of single and multiple tones. We use the “purified” chroma vectors in combination with a harmony progression model based on an HMM where the NMF activation distributions are modeled as observations associated with hidden harmonies, whose transition probabilities have been obtained statistically. We attempt to improve harmony estimation accuracy by combining suppression of irrelevant components and the HMM-based harmony model. In the experimental evaluation, we demonstrate the reduction of irrelevant components in raw chroma vectors computed from recordings of musical instruments. In addition, using music audio data with harmony annotation from the RWC database, we compare the harmony estimation accuracies using our method and conventional chroma.
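The chroma-specific noise described in this abstract can be illustrated with a small sketch. It is not the authors' method and contains no NMF step; it only shows, under simplifying assumptions, why a single tone produces energy in several pitch classes. The function name `overtone_chroma`, the six-harmonic limit, and the geometric amplitude decay are all hypothetical choices made for illustration.

```python
import math

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def overtone_chroma(root_pc, n_harmonics=6, decay=0.5):
    """Fold the harmonics of one tone into a normalized 12-bin chroma vector.

    Assumption: harmonic k has amplitude decay**(k - 1); real instruments differ.
    """
    chroma = [0.0] * 12
    for k in range(1, n_harmonics + 1):
        # Harmonic k lies 12*log2(k) semitones above the fundamental.
        semitones = round(12 * math.log2(k))
        chroma[(root_pc + semitones) % 12] += decay ** (k - 1)
    total = sum(chroma)
    return [c / total for c in chroma]


# A single C tone: harmonics 3 and 6 fold onto G, harmonic 5 folds onto E.
chroma_c = overtone_chroma(0)
active = [PITCH_CLASSES[i] for i, v in enumerate(chroma_c) if v > 0]
print(active)
```

Under these assumptions a lone C already looks like a C major triad in the chroma domain, which is the kind of irrelevant component the abstract proposes to suppress.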

Citations: 0

APSIPA ASC 2018 Organizing Committee

2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2018-11-01 DOI: 10.23919/apsipa.2018.8659710

Citations: 0

Real-time Background Subtraction via L1 Norm Tensor Decomposition

2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2018-11-01 DOI: 10.23919/APSIPA.2018.8659727

Taehyeong Kim, Yoonsik Choe

Abstract: Currently, background subtraction is being actively studied in many image processing applications. Nuclear Norm Minimization (NNM) and Weighted Nuclear Norm Minimization (WNNM) are commonly used background subtraction methods based on Robust Principal Component Analysis (RPCA). However, these techniques approximate the RPCA rank function and take the form of an iterative optimization algorithm. Therefore, due to the approximation, the NNM solution cannot converge if the number of frames is small. In addition, the NNM and WNNM processing times are long because of their iterative optimization schemes. Thus, NNM and WNNM are not suitable for real-time background subtraction. In order to overcome these limitations, this paper presents a real-time background subtraction method using tensor decomposition in accordance with the recent tensor analysis research trend. In this study, we used the closed-form TUCKER2 decomposition solution to omit the iterative process while retaining the L1 norm of the RPCA rank function. This proposed method allows for convergence even when the number of frames is small. Compared to NNM and WNNM, the proposed method reduces the processing time by a factor of more than 80 and has a higher precision even when the number of frames is less than 10.

Citations: 4

A deep learning based framework for converting sign language to emotional speech

2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2018-11-01 DOI: 10.23919/APSIPA.2018.8659571

Nan Song, Hongwu Yang, Pengpeng Zhi

Abstract: This paper proposes a framework for converting sign language to emotional speech by deep learning. We first adopt a deep neural network (DNN) model to extract the features of sign language and facial expression. Then we train two support vector machines (SVM) to classify the sign language and facial expression for recognizing the text of sign language and emotional tags of facial expression. We also train a set of DNN-based emotional speech acoustic models by speaker adaptive training with a multi-speaker emotional speech corpus. Finally, we select the DNN-based emotional speech acoustic models with emotion tags to synthesize emotional speech from the text recognized from the sign language. Objective tests show that the recognition rate for static sign language is 90.7%. The recognition rate of facial expression reaches 94.6% on the extended Cohn-Kanade database (CK+) and 80.3% on the Japanese Female Facial Expression (JAFFE) database, respectively. Subjective evaluation demonstrates that the synthesized emotional speech achieves an emotional mean opinion score of 4.2. The pleasure-arousal-dominance (PAD) three-dimensional emotion model is employed to evaluate the PAD values for both facial expression and synthesized emotional speech. Results show that the PAD values of facial expression are close to the PAD values of synthesized emotional speech. This means that the synthesized emotional speech can express the emotions of facial expression.

Citations: 1

Agglomerative Hierarchical Clustering of Basis Vector for Monaural Sound Source Separation Based on NMF

2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2018-11-01 DOI: 10.23919/APSIPA.2018.8659766

Kenta Murai, Taiho Takeuchi, Y. Tatekura

Abstract: This paper proposes a method of monaural sound source separation by clustering based on the similarity of basis vectors decomposed by Non-negative Matrix Factorization (NMF). In the proposed method, the basis vectors are clustered on the assumption that the similarity between the basis vectors constituting the target sound source is higher than the similarity with the basis vectors of the other sound sources. Hierarchical clustering, which forms clusters in descending order of feature similarity, is introduced. Since it is unnecessary to explicitly determine the number of clusters in hierarchical clustering, the basis vectors can be grouped into an arbitrary number of clusters according to the threshold. Therefore, the proposed method can separate an arbitrary number of sound sources. From the numerical evaluation result, it was found that the Signal to Distortion Ratio (SDR), which is an evaluation index of sound source separation, can be improved by approximately 6 to 10 dB. Undesirable cases in which most of the basis vectors are classified into the same cluster are also discussed. In addition, sound source separation with three mixed sound sources was also evaluated, and it was confirmed that SDR can be improved by about 10 dB.
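The threshold-based clustering idea in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the similarity measure or the linkage rule, so cosine similarity and average linkage are assumptions, and the function names and the toy `basis` vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-negative basis vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def agglomerate(vectors, threshold):
    """Merge the most similar pair of clusters until no pair reaches threshold."""
    clusters = [[i] for i in range(len(vectors))]

    def linkage(a, b):
        # Average pairwise similarity between members (assumed average linkage).
        sims = [cosine(vectors[i], vectors[j]) for i in a for j in b]
        return sum(sims) / len(sims)

    while len(clusters) > 1:
        best_sim, (p, q) = max(
            (linkage(clusters[p], clusters[q]), (p, q))
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        if best_sim < threshold:
            break  # remaining clusters are too dissimilar to merge
        clusters[p] = sorted(clusters[p] + clusters[q])
        del clusters[q]
    return clusters


# Toy basis vectors: two spectral shapes with two near-duplicates each.
basis = [
    [1.0, 0.9, 0.0, 0.0],
    [0.9, 1.0, 0.1, 0.0],
    [0.0, 0.1, 1.0, 0.8],
    [0.0, 0.0, 0.9, 1.0],
]
groups = agglomerate(basis, threshold=0.8)
print(groups)
```

The number of clusters is never passed in; it falls out of the threshold, which is the property the abstract relies on to separate an arbitrary number of sources. A very low threshold reproduces the undesirable case where all basis vectors end up in one cluster.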

Citations: 0

Study on HDR/WCG Service Model for UHD Service

2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2018-11-01 DOI: 10.23919/APSIPA.2018.8659594

Juhan Bae, Jeongyeon Lim, So-Ki Jung

Citations: 0

Log-based Anomalies Detection of MANETs Routing with Reasoning and Verification

2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2018-11-01 DOI: 10.23919/APSIPA.2018.8659549

Teng Li, Jianfeng Ma, Qingqi Pei, Yulong Shen, Cong Sun

Abstract: Routing security plays an important role in Mobile Ad hoc Networks (MANETs). Despite many attempts to improve its security, the routing procedure of MANETs remains vulnerable to attacks. Existing approaches offer support for detecting attacks or debugging in different routing phases, but many of them have not considered the privacy of the nodes during anomaly detection, and they depend on a central control program or a third party to supervise the whole network. In this paper, we present an approach called LAD which uses the raw logs of routers to construct a control flow graph and find the existing communication rules in MANETs. With the reasoning rules, LAD can detect both active and passive attacks launched during the routing phase. LAD can also protect the privacy of the nodes in the verification phase with the specific Merkle hash tree. Without deploying any special nodes to assist the verification, LAD can detect multiple malicious nodes by itself. To show that our approach can be used to guarantee the security of the MANETs, we deploy our experiment in NS3 as well as the practical router environment. LAD can improve the accuracy rate from 2.28% to 29.22%. The results show that LAD has limited time and memory usage, a high detection rate, and a low false-positive rate.
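For readers unfamiliar with Merkle hash trees, a generic sketch follows. It is not LAD's specific tree, which the abstract does not describe; it only shows how a node could prove that one log entry belongs to a committed set while revealing nothing but hashes of the other entries. SHA-256, the duplicate-last-node padding rule, and all function names and log strings are assumptions made for illustration.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_levels(leaves):
    """Return all tree levels, from the hashed leaves up to the root."""
    level = [_h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]  # simplification: pad odd levels
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def merkle_proof(levels, index):
    """Collect (sibling_hash, sibling_is_left) pairs from leaf to root."""
    proof = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1  # the other node in the same pair
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(leaf, proof, root):
    """Recompute the root from one leaf and its sibling hashes."""
    node = _h(leaf)
    for sibling, sibling_is_left in proof:
        node = _h(sibling + node) if sibling_is_left else _h(node + sibling)
    return node == root


# Hypothetical routing log entries held by one node.
logs = [b"RREQ a->b", b"RREP b->a", b"DATA a->c"]
levels = merkle_levels(logs)
root = levels[-1][0]
proof = merkle_proof(levels, 2)
print(verify(logs[2], proof, root))
```

A verifier that holds only `root` can check a single entry with a proof whose length grows logarithmically with the number of logs, without seeing the other entries in clear text.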

Citations: 1

[Copyright notice]

2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2018-11-01 DOI: 10.23919/apsipa.2018.8659690

Citations: 0

Investigating Co-Prime Microphone Arrays for Speech Direction of Arrival Estimation

2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2018-11-01 DOI: 10.23919/APSIPA.2018.8659626

Jiahong Zhao, C. Ritz

Abstract: This paper investigates the application of the steered response power - phase transform (SRP-PHAT) method to coprime microphone array (CPMA) recordings to estimate the direction of arrival (DOA) of speech sources. While existing CPMA approaches for acoustics applications are limited, especially under reverberant conditions, the proposed algorithm utilises SRP-PHAT to estimate the DOA of speech sources and then employs a histogram-based stochastic algorithm using steered response power (SRP) adjustment and kernel density evaluation (KDE) to improve the DOA estimation accuracy. Experiments are conducted for up to three simultaneous speech sources in the far field considering both anechoic and reverberant scenarios. Results suggest that the proposed approach achieves more accurate DOA estimates than a uniform linear array (ULA) with the same number of microphones under both anechoic and low reverberant conditions, and it significantly decreases the number of microphones of another equivalent ULA while maintaining similar performance. Moreover, the operating frequency of the microphone array is largely increased without changing the number of microphones, making it possible to accurately record higher-frequency components of source signals.
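The abstract does not spell out the array layout, so the sketch below uses the textbook coprime geometry as an assumption: two interleaved uniform subarrays with M and N sensors, spaced N and M unit distances apart, sharing the sensor at the origin. The function name and the example values M = 3, N = 4 are hypothetical and are not taken from the paper.

```python
from math import gcd


def coprime_array_positions(m, n, unit=1.0):
    """Sensor positions of a coprime linear array, in multiples of `unit`.

    Assumed geometry: m sensors spaced n units apart plus n sensors spaced
    m units apart, with the sensor at position 0 shared by both subarrays.
    """
    if gcd(m, n) != 1:
        raise ValueError("m and n must be coprime")
    sub1 = {n * i for i in range(m)}
    sub2 = {m * j for j in range(n)}
    return [unit * p for p in sorted(sub1 | sub2)]


positions = coprime_array_positions(3, 4)
print(positions, len(positions))
```

With these values the array uses M + N - 1 = 6 microphones over an aperture of 9 unit spacings, which gives a rough sense of how a coprime layout trades microphone count against aperture compared with a uniform linear array.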

Citations: 7