2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC): Latest Publications

Pose-invariant kinematic features for action recognition

2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2017-12-15 DOI: 10.1109/APSIPA.2017.8282038

M. Ramanathan, W. Yau, E. Teoh, N. Magnenat-Thalmann

Abstract: Recognition of actions from videos is a difficult task due to several factors such as dynamic backgrounds, occlusion, and pose variations. To tackle the pose variation problem, we propose a simple method based on a novel set of pose-invariant kinematic features, which are encoded in a human body-centric space. The proposed framework begins with detection of the neck point, which serves as the origin of the body-centric space. We propose a deep learning based classifier to detect the neck point based on the output of a fully connected network layer. With the help of the detected neck, a propagation mechanism is proposed to divide the foreground region into head, torso and leg grids. The motion observed in each of these body part grids is represented using a set of pose-invariant kinematic features. These features represent the motion of the foreground or body region with respect to the detected neck point's motion and are encoded based on view in a human body-centric space. Based on these features, pose-invariant action recognition can be achieved. Because the body-centric space is used, actions with non-upright human postures can also be handled easily. To test its effectiveness on non-upright human postures in actions, a new dataset is introduced with 8 non-upright actions performed by 35 subjects in 3 different views. Experiments have been conducted on benchmark datasets and the newly proposed non-upright action dataset to identify limitations and gain insights into the proposed framework.

Citations: 0

Locomotion control of a serpentine crawling robot inspired by central pattern generators

2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2017-12-15 DOI: 10.1109/APSIPA.2017.8282067

Jiadong Wang, Wenjuan Ouyang, Wenchao Gao, Qinyuan Ren

Citations: 2

On the construction of more human-like chatbots: Affect and emotion analysis of movie dialogue data

2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2017-12-15 DOI: 10.1109/APSIPA.2017.8282245

Rafael E. Banchs

Citations: 13

Robust template matching using scale-adaptive deep convolutional features

2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2017-12-14 DOI: 10.1109/APSIPA.2017.8282124

Jonghee Kim, Jinsu Kim, Seokeon Choi, Muhammad Abul Hasan, Changick Kim

Abstract: In this paper, we propose a robust and efficient template matching method based on deep convolutional features. The originality of the proposed method is that it is based on a scale-adaptive feature extraction approach. This approach is motivated by the observation that each layer in a CNN represents a different level of deep features of the actual image contents. In order to keep the features scalable, we extract deep feature vectors of the template and the input image adaptively from a layer of a CNN. Using this scalable, deep representation of the image contents, we address template matching by measuring the similarity between the features of the template and the input image with an efficient similarity measure, normalized cross-correlation (NCC). Using NCC helps avoid redundant computations on adjacent patches caused by the sliding window approach. As a result, the proposed method achieves state-of-the-art template matching performance and has a significantly lower computational cost than state-of-the-art methods in the literature.
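To make the NCC similarity step in the abstract above concrete, here is a minimal sketch of template matching with zero-mean normalized cross-correlation. It is an illustration only, not the authors' implementation: plain 2D lists stand in for CNN feature maps, the search is a brute-force sliding window rather than the efficient computation the abstract refers to, and the names `ncc` and `best_match` are hypothetical.

```python
import math


def zero_mean(values):
    """Subtract the mean so the score ignores constant offsets."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def ncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-length vectors."""
    a0, b0 = zero_mean(a), zero_mean(b)
    denom = math.sqrt(sum(x * x for x in a0)) * math.sqrt(sum(y * y for y in b0))
    if denom == 0.0:
        return 0.0  # flat patch: correlation is undefined, treat as no match
    return sum(x * y for x, y in zip(a0, b0)) / denom


def best_match(image, template):
    """Slide the template over the image (both 2D lists of numbers).

    Returns (row, col, score) for the top-left corner with the highest NCC.
    Assumption: the inputs stand in for feature maps; any extractor could
    produce them.
    """
    th, tw = len(template), len(template[0])
    flat_template = [v for row in template for v in row]
    best = (0, 0, -2.0)  # NCC lies in [-1, 1], so -2 is below any real score
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [image[r + i][c + j] for i in range(th) for j in range(tw)]
            score = ncc(patch, flat_template)
            if score > best[2]:
                best = (r, c, score)
    return best
```

Because the score is normalized by both the mean and the vector norm, a template that differs from an image patch only by a positive gain and a constant offset still scores close to 1 at that patch's location.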

Citations: 11

CNN-based bottleneck feature for noise robust query-by-example spoken term detection

2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2017-12-14 DOI: 10.1109/APSIPA.2017.8282220

Hyungjun Lim, Younggwan Kim, Yoonhoe Kim, Hoirin Kim

Citations: 9

A perception system for robot arms to convey objects to in-car passengers

2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2017-12-12 DOI: 10.1109/APSIPA.2017.8282065

Li Jun, Tee Keng Peng, C. Lawrence, Wan Kong Wah, Yau Wei Yun

Abstract: Automatically delivering objects to in-car passengers has many potential applications. Such a system generally consists of two sub-systems: a perception system and an action system. The perception system looks for the targets' positions and the action system delivers objects to the targets. In this paper, we propose a novel perception system, which provides two major functions: estimating reaching points and discovering potential risks. The reaching points are the locations that the robot arm needs to reach. Moreover, they should be comfortably reachable by passengers and keep a safe distance from the car body. To achieve this, all vehicle components (side surfaces, side mirrors, etc.) that may cause a collision need to be detected. Potential risks are usually caused by moving objects or a change in door state (closed to open) during operation. It is necessary to monitor these two situations to avoid any potential risks during operation. Our offline test shows that the accuracy of reaching point estimation reaches up to 94% and the response time for detecting moving objects or door state changes is less than 1 millisecond.

Citations: 1

Motion planning of a 6-DOF robot arm for bandaging nursing task

2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2017-12-01 DOI: 10.1109/APSIPA.2017.8282066

Yi Feng, Zhifeng Huang, Yun Zhang

Citations: 3

Joint unsupervised adaptation of n-gram and RNN language models via LDA-based hybrid mixture modeling

2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2017-12-01 DOI: 10.1109/APSIPA.2017.8282277

Ryo Masumura, Taichi Asami, H. Masataki, Y. Aono

Abstract: This paper reports an initial study of unsupervised adaptation that assumes simultaneous use of both n-gram and recurrent neural network (RNN) language models (LMs) in automatic speech recognition (ASR). It is known that a combination of n-gram and RNN LMs is a more effective approach to ASR than using either of them alone. However, unsupervised adaptation methods that simultaneously adapt both n-gram and RNN LMs have not been presented, while various unsupervised adaptation methods specific to either n-gram LMs or RNN LMs have been examined. In order to handle different LMs in a unified unsupervised adaptation framework, our key idea is to introduce mixture modeling for both n-gram LMs and RNN LMs. Mixture modeling can handle multiple LMs simultaneously, and unsupervised adaptation can be accomplished merely by adjusting their mixture weights using a recognition hypothesis of the input speech. This paper proposes joint unsupervised adaptation achieved by hybrid mixture modeling using both n-gram mixture models and RNN mixture models. We present latent Dirichlet allocation based hybrid mixture modeling for effective topic adaptation. Our experiments on lecture ASR tasks show the effectiveness of joint unsupervised adaptation. We also report the performance obtained when only the n-gram LM or only the RNN LM is adapted.
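The abstract above says that adaptation amounts to adjusting mixture weights using a recognition hypothesis. As a rough sketch of that idea, the code below re-estimates linear interpolation weights with expectation-maximization (EM), a common choice for this kind of problem. The abstract does not state which estimator the authors use, so EM here is an assumption, and the function names `estimate_mixture_weights` and `mixture_prob` as well as the input layout are hypothetical.

```python
def mixture_prob(weights, probs):
    """Probability of one word under the weighted mixture of component LMs."""
    return sum(w * p for w, p in zip(weights, probs))


def estimate_mixture_weights(component_probs, iterations=50):
    """Estimate mixture weights from per-word component probabilities.

    component_probs[t][k] is the probability that component LM k assigns to
    word t of the recognition hypothesis (assumed strictly positive for at
    least one component per word). Returns weights that sum to 1.
    """
    num_components = len(component_probs[0])
    weights = [1.0 / num_components] * num_components  # uniform start
    for _ in range(iterations):
        totals = [0.0] * num_components
        for probs in component_probs:
            mix = mixture_prob(weights, probs)
            # E-step: posterior responsibility of each component for this word
            for k in range(num_components):
                totals[k] += weights[k] * probs[k] / mix
        # M-step: new weight is the average responsibility over the hypothesis
        weights = [total / len(component_probs) for total in totals]
    return weights
```

Each EM iteration does not decrease the likelihood of the hypothesis under the mixture, so components that explain the hypothesized words better receive larger weights. In practice the component probabilities would come from the n-gram and RNN mixture components scored on the first-pass hypothesis.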

Citations: 0

A free Kazakh speech database and a speech recognition baseline

2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2017-12-01 DOI: 10.1109/APSIPA.2017.8282133

Ying Shi, Askar Hamdullah, Zhiyuan Tang, Dong Wang, T. Zheng

Abstract: Automatic speech recognition (ASR) has improved significantly for major languages such as English and Chinese, partly due to the emergence of deep neural networks (DNNs) and large amounts of training data. For minority languages, however, progress lags far behind the mainstream. A particular obstacle is that there are almost no large-scale speech databases for minority languages, and the few databases that exist are held by some institutes as private property, far from open and standard, and very few are free. Besides speech databases, phonetic and linguistic resources are also scarce, including phone sets, lexicons, and language models. In this paper, we publish a speech database in Kazakh, a major minority language in western China. Accompanying this database, a full set of phonetic and linguistic resources is also published, with which a full-fledged Kazakh ASR system can be constructed. We describe the recipe for constructing a baseline system and report our present results. The resources are free for research institutes and can be obtained by request. The publication is supported by the M2ASR project, itself supported by NSFC, which aims to build multilingual ASR systems for minority languages in China.

Citations: 5

Development of under-resourced Bahasa Indonesia speech corpus

2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) Pub Date : 2017-12-01 DOI: 10.1109/APSIPA.2017.8282191

E. Cahyaningtyas, D. Arifianto

Citations: 7