Latest Publications from the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)

PGT: Proposal-guided object tracking
Han-Ul Kim, Chang-Su Kim
DOI: 10.1109/APSIPA.2017.8282318 | Published: 2017-12-01
Abstract: We propose a robust visual tracking system, which refines initial estimates of a base tracker by employing object proposal techniques. First, we decompose the base tracker into three building blocks: representation method, appearance model, and model update strategy. We then design each building block by adopting and improving ideas from recent successful trackers. Second, we propose the proposal-guided tracking (PGT) algorithm. Given proposals generated by an edge-based object proposal technique, we select only the proposals that can improve the result of the base tracker using several cues. Then, we discriminate target proposals from non-target ones, based on nearest-neighbor classification using the target and background models. Finally, we replace the result of the base tracker with the best target proposal. Experimental results demonstrate that the proposed PGT algorithm provides excellent results on a visual tracking benchmark.
Citations: 0
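The target/non-target discrimination step lends itself to a compact illustration. Below is a minimal sketch assuming Euclidean nearest-neighbor distances over hypothetical fixed-length proposal features; the paper's actual feature representation and appearance models are more elaborate.

```python
import numpy as np

def classify_proposals(proposal_feats, target_model, background_model):
    """Label each proposal as target (True) or non-target (False) by the
    nearest-neighbor rule over stored target/background feature sets."""
    labels = []
    for f in proposal_feats:
        d_target = np.linalg.norm(target_model - f, axis=1).min()
        d_background = np.linalg.norm(background_model - f, axis=1).min()
        labels.append(d_target < d_background)
    return np.array(labels)

# Toy usage: 5 proposals with hypothetical 8-D features.
rng = np.random.default_rng(0)
proposals = rng.normal(size=(5, 8))
target_model = rng.normal(loc=1.0, size=(10, 8))
background_model = rng.normal(loc=-1.0, size=(10, 8))
print(classify_proposals(proposals, target_model, background_model))
```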
A drag-and-drop type human computer interaction technique based on electrooculogram
S. Ogai, Toshihisa Tanaka
DOI: 10.1109/APSIPA.2017.8282126 | Published: 2017-12-01
Abstract: A fundamental limitation of human-computer interaction using the electrooculogram (EOG) is the low accuracy of eye tracking and the head movements that invalidate the calibration of the on-monitor gaze coordinates. In this paper, we develop a drag-and-drop type interface with the EOG that avoids a direct estimation of gaze location and frees users from the restriction of head movement. To drag a cursor on the screen, the proposed system models the relationship between the amount of eye movement and the EOG amplitude with linear regression. Five subjects participated in an experiment comparing the proposed drag-and-drop type interface with a conventional direct-gaze type interface. Performance measures such as efficiency and satisfaction showed the advantage of the proposed method with significant differences (p < 0.05).
Citations: 1
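The core of the dragging mechanism is a linear map from EOG amplitude to eye-movement amount. A minimal sketch of such a calibration follows; the calibration values, units, and pixel gain are hypothetical, and the paper's actual regression setup may differ.

```python
import numpy as np

# Hypothetical calibration data: EOG amplitudes (uV) recorded for known
# horizontal eye movements (degrees).
eog_amplitude = np.array([12.0, 25.0, 37.0, 51.0, 63.0])
eye_movement_deg = np.array([5.0, 10.0, 15.0, 20.0, 25.0])

# Least-squares fit: eye_movement ~ a * eog_amplitude + b.
a, b = np.polyfit(eog_amplitude, eye_movement_deg, deg=1)

def cursor_displacement(eog_sample, gain_px_per_deg=20.0):
    """Map a new EOG amplitude to a relative cursor displacement (pixels)."""
    return gain_px_per_deg * (a * eog_sample + b)

print(cursor_displacement(40.0))
```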
Sound source localization using binaural difference for hose-shaped rescue robot
Narumi Mae, Yoshiki Mitsui, S. Makino, Daichi Kitamura, Nobutaka Ono, Takeshi Yamada, H. Saruwatari
DOI: 10.1109/APSIPA.2017.8282292 | Published: 2017-12-01
Abstract: Rescue robots have been developed for search and rescue operations in times of large-scale disasters. Such a robot is used to search for survivors in disaster sites by capturing their voices with its microphone array. However, since the robot has many vibration motors, ego noise is mixed with the voices, and it is difficult to distinguish a survivor's call for help from the ego noise. In our previous work, we proposed an ego-noise reduction technique that combines a blind source separation method called independent low-rank matrix analysis with postprocessing for noise cancellation. In practical use, to determine the precise location of survivors, the direction of the observed voice should be estimated after the ego-noise reduction process. To achieve this objective, in this study, a new hose-shaped rescue robot with microphone arrays was developed. Moreover, since this robot can record stereo sound, we add a postfilter called MOSIE to our previous noise reduction method. Through experiments at a simulated disaster site, we confirm that the operator can perceive the direction of a survivor's location by applying a speech enhancement technique combining independent low-rank matrix analysis, noise cancellation, and postfiltering to the observed multichannel noisy signals.
Citations: 1
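As a hedged illustration of the binaural-difference cue itself (not the paper's ILRMA-plus-postfilter pipeline), the sketch below estimates a direction of arrival from the interaural time difference via cross-correlation; the microphone spacing, sample rate, and signals are synthetic.

```python
import numpy as np

def interaural_time_difference(left, right, fs):
    """Estimate the time lag (s) between two channels from the peak of
    their cross-correlation; negative means the left channel leads."""
    corr = np.correlate(left, right, mode="full")
    lag = np.argmax(corr) - (len(right) - 1)
    return lag / fs

def doa_from_tdoa(tdoa, mic_distance, c=343.0):
    """Convert a time difference to a direction-of-arrival angle (rad)."""
    return np.arcsin(np.clip(c * tdoa / mic_distance, -1.0, 1.0))

# Toy usage: the right channel is the left delayed by 2 samples at 16 kHz,
# with a hypothetical 5 cm microphone spacing.
fs, delay = 16000, 2
x = np.random.default_rng(1).normal(size=2048)
left = x
right = np.concatenate([np.zeros(delay), x[:-delay]])
tdoa = interaural_time_difference(left, right, fs)
print(np.degrees(doa_from_tdoa(tdoa, mic_distance=0.05)))
```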
Real-time digitized neural-spike storage scheme in multiple channels for biomedical applications
Anand Kumar Mukhopadhyay, I. Chakrabarti, M. Sharad
DOI: 10.1109/APSIPA.2017.8282256 | Published: 2017-12-01
Abstract: Recording real-time neural spikes (N-spikes) into an on-chip memory module is essential for processing the stored information in neurological applications such as neural spike sorting. Spike sorting is a process used in biomedical signal processing where incoming real-time spikes are mapped to the neurons from which they originate. This paper compares power- and area-efficient architectural-level schemes for storing digitized N-spikes recorded through multiple channels into a single-port random access memory (SPRAM) module. The power dissipation of the proposed storage scheme is on the order of a few μW. The architectural-level analysis of the schemes was performed in a 0.18 μm CMOS process technology using the Synopsys Design Compiler tool.
Citations: 1
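The paper's schemes are hardware architectures, but the storage pattern can be illustrated in software. The sketch below is only an analogy to an SPRAM-based design: a circular multi-channel spike-snippet store with hypothetical slot counts and snippet lengths.

```python
import numpy as np

class SpikeStore:
    """Fixed-capacity circular store for digitized spike snippets from
    multiple channels (a software analogy of an on-chip SPRAM scheme)."""
    def __init__(self, n_slots, snippet_len):
        self.data = np.zeros((n_slots, snippet_len), dtype=np.int16)
        self.channel = np.zeros(n_slots, dtype=np.int16)
        self.head = 0
        self.count = 0

    def write(self, channel_id, snippet):
        # Overwrite the oldest slot when the memory is full.
        self.data[self.head] = snippet
        self.channel[self.head] = channel_id
        self.head = (self.head + 1) % len(self.data)
        self.count = min(self.count + 1, len(self.data))

# Toy usage: store one 48-sample snippet arriving on channel 3.
store = SpikeStore(n_slots=256, snippet_len=48)
store.write(3, np.arange(48, dtype=np.int16))
print(store.count, store.channel[0])
```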
An investigation to transplant emotional expressions in DNN-based TTS synthesis
Katsuki Inoue, Sunao Hara, M. Abe, Nobukatsu Hojo, Yusuke Ijima
DOI: 10.1109/APSIPA.2017.8282231 | Published: 2017-12-01
Abstract: In this paper, we investigate deep neural network (DNN) architectures to transplant emotional expressions to improve the expressiveness of DNN-based text-to-speech (TTS) synthesis. DNNs are expected to have potential power in mapping between linguistic information and acoustic features. From multi-speaker and/or multi-language perspectives, several types of DNN architecture have been proposed and have shown good performance. We tried to extend the idea to transplanting emotion, constructing shared emotion-dependent mappings. The following three types of DNN architecture are examined: (1) the parallel model (PM), with an output layer consisting of both speaker-dependent layers and emotion-dependent layers; (2) the serial model (SM), with an output layer consisting of emotion-dependent layers preceded by speaker-dependent hidden layers; and (3) the auxiliary input model (AIM), with an input layer consisting of emotion and speaker IDs as well as linguistic feature vectors. The DNNs were trained using neutral speech uttered by 24 speakers, and sad speech and joyful speech uttered by 3 of those 24 speakers. In terms of unseen emotion synthesis, subjective evaluation tests showed that the PM performs much better than the SM and slightly better than the AIM. In addition, the tests showed that the SM is the best of the three models when the training data includes emotional speech uttered by the target speaker.
Citations: 26
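Of the three architectures, the auxiliary input model (AIM) is the simplest to sketch. The toy PyTorch module below concatenates one-hot speaker and emotion IDs with the linguistic feature vector at the input layer. The 24 speakers and 3 emotion categories follow the abstract; all layer sizes and the acoustic output dimension are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AuxiliaryInputModel(nn.Module):
    """Sketch of the AIM: one-hot speaker and emotion IDs are appended
    to the linguistic feature vector before the shared hidden layers."""
    def __init__(self, ling_dim=300, n_speakers=24, n_emotions=3,
                 hidden=256, acoustic_dim=187):
        super().__init__()
        self.n_speakers, self.n_emotions = n_speakers, n_emotions
        self.net = nn.Sequential(
            nn.Linear(ling_dim + n_speakers + n_emotions, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, acoustic_dim),
        )

    def forward(self, ling, speaker_id, emotion_id):
        spk = F.one_hot(speaker_id, self.n_speakers).float()
        emo = F.one_hot(emotion_id, self.n_emotions).float()
        return self.net(torch.cat([ling, spk, emo], dim=-1))

# Toy forward pass: a batch of 2 frames.
model = AuxiliaryInputModel()
y = model(torch.randn(2, 300), torch.tensor([0, 5]), torch.tensor([1, 2]))
print(y.shape)  # torch.Size([2, 187])
```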
Blind speaker counting in highly reverberant environments by clustering coherence features
Shahab Pasha, Jacob Donley, C. Ritz
DOI: 10.1109/APSIPA.2017.8282303 | Published: 2017-12-01
Abstract: This paper proposes the use of the frequency-domain magnitude squared coherence (MSC) between two ad-hoc recordings of speech as a reliable speaker discrimination feature for source counting applications in highly reverberant environments. The proposed source counting method does not require knowledge of the microphone spacing and does not assume any relative distance between the sources and the microphones. Source counting is based on clustering the frequency-domain MSC of the speech signals derived from short time segments. Experiments show that the frequency-domain MSC is speaker-dependent, and the method was successfully used to obtain highly accurate source counting results for up to six active speakers under varying levels of reverberation and microphone spacing.
Citations: 8
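A simplified stand-in for the counting procedure is sketched below: compute the segment-wise MSC between the two channels with scipy, cluster the MSC vectors, and pick the cluster count with the best silhouette score. The silhouette-based decision rule and all parameter values are assumptions; the paper's clustering and count-selection details may differ.

```python
import numpy as np
from scipy.signal import coherence
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def count_speakers(x1, x2, fs, seg_len, max_sources=6):
    """Cluster per-segment MSC vectors between two channels and return
    the cluster count with the best silhouette score (an assumed rule)."""
    feats = []
    for start in range(0, len(x1) - seg_len + 1, seg_len):
        _, msc = coherence(x1[start:start + seg_len],
                           x2[start:start + seg_len], fs=fs, nperseg=256)
        feats.append(msc)
    feats = np.array(feats)
    best_k, best_score = 2, -1.0
    for k in range(2, max_sources + 1):
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(feats)
        score = silhouette_score(feats, labels)
        if score > best_score:
            best_k, best_score = k, score
    return best_k

# Toy usage on synthetic noise (real inputs would be the two microphones).
rng = np.random.default_rng(0)
x1, x2 = rng.normal(size=16000), rng.normal(size=16000)
print(count_speakers(x1, x2, fs=16000, seg_len=2000))
```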
Joint estimation of signal and mutual coupling parameters based on spatially spread polarization sensitive array
Huiyong Li, Zihui Luo, Julan Xie, Jun Li
DOI: 10.1109/APSIPA.2017.8282156 | Published: 2017-12-01
Abstract: A reduced-dimensional MUSIC (RD-MUSIC) algorithm is proposed to reduce the computational cost of blind joint estimation of direction-of-arrival (DOA), polarization, and mutual coupling parameters based on a spatially spread polarization-sensitive uniform linear array (ULA). The algorithm works in two steps. In the first step, the DOA and polarization parameters are separated from the mutual coupling through a matrix transformation and estimated by the RD-MUSIC algorithm. In the second step, the mutual coupling coefficients are estimated via eigendecomposition with a modulus constraint. Simulation results show the effectiveness of the proposed method for the joint estimation of signal parameters and mutual coupling coefficients.
Citations: 2
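For orientation, the sketch below implements classical narrowband MUSIC for a plain ULA, not the proposed reduced-dimensional, polarization-sensitive variant with mutual coupling; the array geometry, snapshot count, and noise level are synthetic.

```python
import numpy as np

def music_spectrum(X, n_sources, d_over_lambda=0.5,
                   grid=np.linspace(-90, 90, 361)):
    """Classical MUSIC pseudospectrum for a uniform linear array.
    X: (n_sensors, n_snapshots) complex snapshot matrix."""
    n_sensors = X.shape[0]
    R = X @ X.conj().T / X.shape[1]          # sample covariance
    eigval, eigvec = np.linalg.eigh(R)       # eigenvalues ascending
    En = eigvec[:, : n_sensors - n_sources]  # noise subspace
    p = []
    for theta in np.deg2rad(grid):
        a = np.exp(-2j * np.pi * d_over_lambda
                   * np.arange(n_sensors) * np.sin(theta))
        p.append(1.0 / np.real(a.conj() @ En @ En.conj().T @ a))
    return grid, np.array(p)

# Toy usage: one source at +20 degrees, 8-element half-wavelength ULA.
rng = np.random.default_rng(0)
n, snaps, theta0 = 8, 200, np.deg2rad(20)
a0 = np.exp(-2j * np.pi * 0.5 * np.arange(n) * np.sin(theta0))
s = rng.normal(size=snaps) + 1j * rng.normal(size=snaps)
noise = 0.1 * (rng.normal(size=(n, snaps)) + 1j * rng.normal(size=(n, snaps)))
X = np.outer(a0, s) + noise
grid, p = music_spectrum(X, n_sources=1)
print(grid[np.argmax(p)])  # should be near 20
```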
Development of a multi-modal personal authentication interface
Sung-Phil Kim, Jae-Hwan Kang, Y. Jo, Ian Oakley
DOI: 10.1109/APSIPA.2017.8282125 | Published: 2017-12-01
Abstract: Recent advances have brought biometric user interfaces such as fingerprint and iris recognition into users' daily lives. More advanced biometric techniques are on the verge of development and commercialization, with increasing levels of security. This paper presents recent work on the development of a multi-factor personal authentication system. The proposed system is based on a user's unique cognitive responses to predetermined stimuli. Biometric signals such as brain activity are used to measure the cognitive responses. The approach to implementing such a system and authentication test results are presented. The discussion includes the feasibility of the system as well as potential scenarios for using multi-factor authentication interfaces.
Citations: 1
Online sound structure analysis based on generative model of acoustic feature sequences
Keisuke Imoto, Nobutaka Ono, M. Niitsuma, Y. Yamashita
DOI: 10.1109/APSIPA.2017.8282236 | Published: 2017-12-01
Abstract: We propose a method for online sound structure analysis based on a Bayesian generative model of acoustic feature sequences, which assumes a hierarchical generative process over sound clips, acoustic topics, acoustic words, and acoustic features. In this model, sound clips are assumed to be organized as combinations of latent acoustic topics, and each acoustic topic is represented by a Gaussian mixture model (GMM) over an acoustic feature space, where the components of the GMM correspond to acoustic words. Since the conventional batch algorithm for learning this model requires a huge amount of computation, it is difficult to analyze massive amounts of sound data. Moreover, the batch algorithm does not allow us to analyze sequentially obtained data. Our variational-Bayes-based online algorithm for this generative model can analyze the structure of sounds clip by clip. Experimental results show that the proposed online algorithm reduces the computational cost by about 90% and estimates the posterior distributions as efficiently as the conventional batch algorithm.
Citations: 0
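The paper's online variational Bayes updates are specific to its topic model, but the clip-by-clip flavor can be roughly approximated with scikit-learn's warm_start option, which reuses the previous GMM parameters as the starting point for each new clip. This is only an analogy, not the proposed algorithm; the feature dimension and component count below are hypothetical.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Incrementally refit an "acoustic word" GMM as sound clips arrive.
# warm_start=True makes each fit() start from the previous parameters.
gmm = GaussianMixture(n_components=8, warm_start=True,
                      max_iter=20, random_state=0)

rng = np.random.default_rng(0)
for clip_index in range(5):                  # a stream of 5 sound clips
    clip_feats = rng.normal(size=(500, 13))  # e.g. 13-D MFCC-like frames
    gmm.fit(clip_feats)                      # update from previous state
    print(clip_index, gmm.lower_bound_)      # per-clip lower-bound score
```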
Understanding multiple-input multiple-output active noise control from a perspective of sampling and reconstruction
Chuang Shi, Huiyong Li, Dongyuan Shi, Bhan Lam, W. Gan
DOI: 10.1109/APSIPA.2017.8282013 | Published: 2017-12-01
Abstract: This paper formulates multiple-input multiple-output active noise control as a spatial sampling and reconstruction problem. In the proposed formulation, the inputs from the reference microphones and the outputs of the anti-noise sources are regarded as spatial samples. We show that the proposed formulation is general and can unify the existing control strategies. Three control strategies, for instance, are derived from the formulation and linked to different cost functions in practical implementation. Finally, simulation results are presented to verify the effectiveness of our analysis.
Citations: 13
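The abstract stays at the formulation level. As background only, the sketch below implements single-channel FxLMS, the basic adaptive loop that multichannel ANC formulations generalize; it is not the paper's method, and the secondary path, step size, and signals are toy values.

```python
import numpy as np

def fxlms(reference, disturbance, sec_path, n_taps=32, mu=0.005):
    """Single-channel FxLMS: adapt control filter w so that the anti-noise,
    filtered through the secondary path, cancels the disturbance."""
    w = np.zeros(n_taps)
    x_hist = np.zeros(n_taps)         # recent reference samples
    y_hist = np.zeros(len(sec_path))  # recent anti-noise samples
    fx = np.zeros(len(sec_path))      # reference history for filtering
    fx_hist = np.zeros(n_taps)        # filtered-reference history
    errors = np.zeros(len(reference))
    for n in range(len(reference)):
        x_hist = np.concatenate(([reference[n]], x_hist[:-1]))
        y = w @ x_hist                                   # anti-noise output
        y_hist = np.concatenate(([y], y_hist[:-1]))
        errors[n] = disturbance[n] + sec_path @ y_hist   # residual at error mic
        fx = np.concatenate(([reference[n]], fx[:-1]))
        fx_hist = np.concatenate(([sec_path @ fx], fx_hist[:-1]))
        w -= mu * errors[n] * fx_hist                    # LMS update
    return errors

# Toy usage: cancel a 200 Hz tone with a hypothetical 3-tap secondary path.
fs = 8000
t = np.arange(4000) / fs
x = np.sin(2 * np.pi * 200 * t)               # reference
d = 0.8 * np.sin(2 * np.pi * 200 * t + 0.5)   # correlated disturbance
e = fxlms(x, d, sec_path=np.array([0.0, 0.6, 0.3]))
print(e[:400].var(), e[-400:].var())          # residual power drops
```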