Hideki Kawahara, M. Morise, T. Toda, Hideki Banno, R. Nisimura, T. Irino
{"title":"Excitation source design for high-quality speech manipulation systems based on a temporally static group delay representation of periodic signals","authors":"Hideki Kawahara, M. Morise, T. Toda, Hideki Banno, R. Nisimura, T. Irino","doi":"10.1109/APSIPA.2014.7041594","DOIUrl":"https://doi.org/10.1109/APSIPA.2014.7041594","url":null,"abstract":"A new group delay representation, which yields value zero for periodic signals irrespective to the initial phase and the relative level of each harmonic component. This new group delay representation provides a unified basis for defining \"aperiodicity\" in speech sounds. For example, the periodic to noise ratio or harmonic to noise ratio is directly derived from the deviation of this group delay representation from value zero, after removing FM effects of harmonic frequencies and removing AM effects of harmonic component level. The derived deviation is combined with estimated excitation duration information and used to design aperiodic components of excitation source for high-quality synthetic speech. The proposed group delay representation is based on F0-adaptive weighted average of frequency shifted versions and temporally shifted versions of group delays with power spectral weighting.","PeriodicalId":231382,"journal":{"name":"Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130557938","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quality enhancement for feature matching on car black box videos","authors":"C. Simon, Man Hee Lee, I. Park","doi":"10.1109/APSIPA.2014.7041750","DOIUrl":"https://doi.org/10.1109/APSIPA.2014.7041750","url":null,"abstract":"Video has difficulty to maintain consistent intensity and color tone from frame to frame. Particularly, it happens when imaging device such as black box camera has to deal with fast changing illumination environment. However, conventional automatic white balance algorithms cannot handle this good enough to maintain tone consistency, which is observed in most commercial black box products. In this paper, a novel tone stabilization is proposed to enhance the performance of further applied algorithms like detecting and matching visual features across video frames. The proposed technique utilizes multiple anchor frames as references to smooth tone fluctuation between them. Experimental result shows the improvement of tone consistency as well as feature detection and matching accuracy on car black box videos with varying tone over time.","PeriodicalId":231382,"journal":{"name":"Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127883552","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improved robustness of biometric authentication system using features of utterance","authors":"Qian Shi, Y. Kajikawa","doi":"10.1109/APSIPA.2014.7041562","DOIUrl":"https://doi.org/10.1109/APSIPA.2014.7041562","url":null,"abstract":"In this paper, we propose a novel biometric authentication system using motion vectors of lips. We have already proposed a biometric authentication system using multimodal features of utterance. However, since both the edges and texture of lips can be easily extracted from a still image, an imposter may be recognized as a registrant by using a still image of the registrant. Therefore, the robustness of our biometric authentication system must be enhanced. Hence, we utilize motion vectors of lips as a feature. The proposed authentication system utilizes physical traits (edges and texture) and a behavioral trait (motion vectors) in the lip region to improve the security. Experimental results demonstrate that motion vectors in the lip region are effective for improving the robustness against imposters and can increase the authentication rate.","PeriodicalId":231382,"journal":{"name":"Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127914039","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
N. Srisuwan, Michael Wand, M. Janke, P. Phukpattaranont, Tanja Schultz, C. Limsakul
{"title":"Enhancement of EMG-based Thai number words classification using frame-based time domain features with stacking filter","authors":"N. Srisuwan, Michael Wand, M. Janke, P. Phukpattaranont, Tanja Schultz, C. Limsakul","doi":"10.1109/APSIPA.2014.7041549","DOIUrl":"https://doi.org/10.1109/APSIPA.2014.7041549","url":null,"abstract":"In order to overcome a problem existing in a classical automatic speech recognition (e.g. ambient noise and loss of privacy), Electromyography (EMG) from speech production muscles was used in place of a human speech signal. We aim to investigate the EMG speech recognition based on Thai language. The earlier work, we used five channels of the EMG from the facial and neck muscles to classify 11 Thai number words based on Neural Network Classification. 15 features in time domain and frequency domain were employed for feature extraction. We obtained an average accuracy rate of 89.45% for audible speech and 78.55% for silent speech. However, it needs to be enhanced to get the best result. This paper proposes to improve an accuracy rate of EMG-based Thai number words classification. The ten subjects uttered 11 words in both an audible and a silent speech while five channels of the EMG signal were captured. Frame-based time domain features with a stacking filter was performed for feature extraction stage. After that, LDA was used to lessen a dimension of the feature vector. Hidden Markov Model (HMM) was employed in classification stage. The results show that using above techniques of feature extraction, feature dimensionality reduction and classification can improve an average accuracy rate by 3% absolute for audible speech when were compared to earlier work. We achieved an average classification rate of 92.45% and 75.73% for audible and silent speech respectively.","PeriodicalId":231382,"journal":{"name":"Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127938852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Combined per-user SLNR and SINR criterions for interference alignment in uplink coordinated multi-point joint reception","authors":"A. E. Rakhmania, P. Tsai, O. Setyawati","doi":"10.1109/APSIPA.2014.7041771","DOIUrl":"https://doi.org/10.1109/APSIPA.2014.7041771","url":null,"abstract":"An interference alignment (IA) algorithm for uplink coordinated multi-point (CoMP) is proposed. For the design of precoder at the transmitter, the unselfish per-user signal-to-leakage-and-noise ratio (SLNR) criterion is used. On the other hand, the per-user signal-to-interference-and-noise-ratio (SINR) criterion is adopted to determine the decoder of the receiver. The proposed algorithm does not rely on the channel reciprocity and thus is suitable to operate in case of different user transmission powers. Through iterative procedure, we show that the per-user-based criterion which keeps user data streams orthogonal can suppress interference effectively and achieve higher sum rate than the conventional IA algorithms, such as minimum leakage and maximum per-stream SINR algorithms in the multi-user CoMP joint reception scenarios.","PeriodicalId":231382,"journal":{"name":"Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125499731","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Wiring control by RTL design for reconfigurable wave-pipelined circuits","authors":"Tomoaki Sato, S. Chivapreecha, P. Moungnoul","doi":"10.1109/APSIPA.2014.7041673","DOIUrl":"https://doi.org/10.1109/APSIPA.2014.7041673","url":null,"abstract":"High-speed and low-power circuits of considering the development cycle for digital signal processing are very important in a mobile computing. The achievement of them on an FPGA (Field Programmable Gate Array) dominant in the point of shortening the development cycle. Nevertheless a reconfigurable device such as an FPGA for a power-aware design has not been developed. The authors have developed logic blocks for reconfigurable wave-pipelined circuits for the achievement of high-speed and low-power reconfigurable circuits. Wave-pipeline is one of a circuit design technique for high-speed processing and low-power consumption. They are very useful for the reduction in the resource on the FPGA. However, a wiring control to connect them have not been achieved. In this paper, the wiring control by RTL Design is developed. Its operation speeds are evaluated using 0.18 um CMOS technology.","PeriodicalId":231382,"journal":{"name":"Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122991949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Probabilistic growth model for dendrobium orchid","authors":"Korakoch Kongsombut, R. Chaisricharoen","doi":"10.1109/APSIPA.2014.7041816","DOIUrl":"https://doi.org/10.1109/APSIPA.2014.7041816","url":null,"abstract":"Dendrobium orchid has several plant states which are requiring different patterns of cultivation. In order to deliver appropriate advice to orchid farmers, status of their farms must be aware especially in the composite of orchid in each state. To model and predict farm status based on given initial data, a growth model is introduced in form of the CDF which can be easily adapted to estimate ratio of status change based on certain amount of plant in each state. The experiment involves around 120 orchid plants divided into four growing states with over one year of observations. The proposed model is confirmed with collected data which is strongly representing normal distribution behavior.","PeriodicalId":231382,"journal":{"name":"Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114251670","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
T. Hashimoto, B. Chakraborty, S. Aramvith, T. Kuboyama, Y. Shirota
{"title":"Affected people's needs detection after the East Japan Great Earthquake — Time series analysis using LDA","authors":"T. Hashimoto, B. Chakraborty, S. Aramvith, T. Kuboyama, Y. Shirota","doi":"10.1109/APSIPA.2014.7041714","DOIUrl":"https://doi.org/10.1109/APSIPA.2014.7041714","url":null,"abstract":"After the East Japan Great Earthquake happened on Mar. 11, 2011, many affected people who lost houses, jobs and families fell into difficulties. Governmental agencies and NPOs supported them by offering relief supplies, foods, evacuation centers and temporary houses. When various supports were offered to affected people, if Governmental agencies and NPOs could detect their needs appropriately, it was effective for supporting them. This paper proposes the method to extract affected people's needs from Social Media after the Earthquake and analyze their needs changes over time. We target the blog that expressed thoughts, requirements and complaints of affected people, and adopt the Latent Dirichlet Allocation (LDA) that is one of popular techniques for topic extraction. We then compare the analysis result with affected people's actual situation and real events and evaluate the effectiveness of our method. In addition, we evaluate the effectiveness as the method that can help decision making for providing appropriate supports to affected people.","PeriodicalId":231382,"journal":{"name":"Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115983482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhanced local feature approach for overlapping sound event recognition","authors":"J. Dennis, T. H. Dat","doi":"10.1109/APSIPA.2014.7041646","DOIUrl":"https://doi.org/10.1109/APSIPA.2014.7041646","url":null,"abstract":"In this paper, we propose a feature-based approach to address the challenging task of recognising overlapping sound events from single channel audio. Our approach is based on our previous work on Local Spectrogram Features (LSFs), where we combined a local spectral representation of the spectrogram with the Generalised Hough Transform (GHT) voting system for recognition. Here we propose to take the output from the GHT and use it as a feature for classification, and demonstrate that such an approach can improve upon the previous knowledge-based scoring system. Experiments are carried out on a challenging set of five overlapping sound events, with the addition of non-stationary background noise and volume change. The results show that the proposed system can achieve a detection rate of 99% and 91% in clean and 0dB noise conditions respectively, which is a strong improvement over our previous work.","PeriodicalId":231382,"journal":{"name":"Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116648235","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An effective reduction of subproblems in design of CSD coefficient FIR filters","authors":"Takuya Imaizumi, K. Suyama","doi":"10.1109/APSIPA.2014.7041542","DOIUrl":"https://doi.org/10.1109/APSIPA.2014.7041542","url":null,"abstract":"In this paper, an effective reduction method of the number of subproblems for the design of CSD (Canonic Signed Digit) coefficient FIR (Finite Impulse Response) filters using BB (Branch and Bound method) is studied. The problem can be formulated as a mixed integer programming problem. The problem can be optimally solved by using the BB. For solving the problem, it is required to solve a large number of subproblems, and it causes a high computational time. Recently, a novel method for the reduction of the number of subproblems has been proposed. In the method, the subproblems can be reduced by starting from an initial branch tree constructed by an approximate solution obtained by any heuristic method. However, the reduction of subproblems depends on the heuristic method applied. In this paper, the effective methods for the reduction of subproblems are studied. Several examples are shown to present the efficiency of the studied methods.","PeriodicalId":231382,"journal":{"name":"Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116685190","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}