{"title":"A method of single channel blind source separation of co-frequency 16QAM signals","authors":"Zhu Wengui, Zhang Yu-ren","doi":"10.1109/ICALIP.2016.7846587","DOIUrl":"https://doi.org/10.1109/ICALIP.2016.7846587","url":null,"abstract":"With the rapid development of wireless communication, a receiver equipped with one antenna receives more than one signal in the same frequency band simultaneously. For this reason, single channel blind source separation (BSS) has become a hot topic in the signal processing field. Nevertheless, the study of single channel blind source separation faces great challenges since the problem is usually ill-posed. Many scholars have made considerable efforts to solve this ill-posed problem. In this paper, a method based on maximum likelihood (ML) is proposed to separate two overlapped co-frequency 16QAM signals, exploiting the fact that communication signals have a finite set of symbols and that some preamble symbols are already known. Computer simulations of the method's performance show its reliability.","PeriodicalId":184170,"journal":{"name":"2016 International Conference on Audio, Language and Image Processing (ICALIP)","volume":"275 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125144784","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Six dimensional clustering segmentation of color point cloud","authors":"Z. Ximin, Wan Wanggen","doi":"10.1109/ICALIP.2016.7846670","DOIUrl":"https://doi.org/10.1109/ICALIP.2016.7846670","url":null,"abstract":"This paper focuses on the clustering segmentation of 3D color point clouds. We extend the mean shift algorithm to the 3D xyz space and additionally consider the rgb color information, so 6-dimensional data is adopted in the algorithm. The cluster center converges to the joint position of the local maximum density and the minimum gradient change of color, so our clustering segmentation not only considers the local geometrical features but also utilizes the color information. The experiments show that our segmentation has better region consistency and clear segmentation borders between neighbors of different colors.","PeriodicalId":184170,"journal":{"name":"2016 International Conference on Audio, Language and Image Processing (ICALIP)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130641784","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Music boundary detection with multiple features","authors":"Weiyao Xue, Shutao Sun, Fengyan Wu, Yongbin Wang","doi":"10.1109/ICALIP.2016.7846614","DOIUrl":"https://doi.org/10.1109/ICALIP.2016.7846614","url":null,"abstract":"Music structural analysis tasks, such as music indexing, music summarization, and similarity analysis, have an important position in the field of music information retrieval and require an understanding of how humans process music internally. Many schemes have been proposed to analyze the structure of recorded music; however, they usually use a single feature to detect song boundaries, and the results are not satisfactory. In this paper, we present a method for music boundary detection that is based on novelty detection and combines multiple features. We extract peaks of the novelty function derived from various features as potential boundaries, then eliminate non-boundaries from the potential boundaries derived from distinct feature sets. Three types of features, namely intensity, timbre, and harmony, are employed to represent the characteristics of a music clip. On our testing database composed of 175 entire songs, the best boundary detection accuracy with a tolerance of ±3 seconds reaches 65.7%.","PeriodicalId":184170,"journal":{"name":"2016 International Conference on Audio, Language and Image Processing (ICALIP)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133442395","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scene recognition algorithm based on multi-feature and weighted minimum distance classifier for digital hearing aids","authors":"Ru-wei Li, Shuang Zhang, Xiaoqun Yi","doi":"10.1109/ICALIP.2016.7846557","DOIUrl":"https://doi.org/10.1109/ICALIP.2016.7846557","url":null,"abstract":"The recognition precision of existing auditory scene recognition algorithms is relatively satisfactory, but they can only be applied to several noise scenarios and cannot meet the performance requirements of digital hearing aids in complex environments. To solve these problems, a scene recognition algorithm based on multiple features and a weighted minimum distance classifier is proposed in this paper. In this algorithm, a speech endpoint detection algorithm based on band-partitioning spectral entropy and spectral energy is used to divide the noisy speech into speech segments and noise segments. Then features such as the Critical Band Ratio, the band-partitioning spectral entropy, and the adaptive short-time zero crossing rate of each segment are extracted for the weighted minimum distance classifier to recognize the noise scenario. The experimental results show that the proposed algorithm has strong robustness and high accuracy, and that it is suitable for use in digital hearing aids.","PeriodicalId":184170,"journal":{"name":"2016 International Conference on Audio, Language and Image Processing (ICALIP)","volume":"42 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114101051","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A precise evaluation method of prosodic quality of non-native speakers using average voice and prosody substitution","authors":"Hafiyan Prafianto, Takashi Nose, A. Ito","doi":"10.1109/ICALIP.2016.7846620","DOIUrl":"https://doi.org/10.1109/ICALIP.2016.7846620","url":null,"abstract":"We propose a method to improve the consistency of human evaluation of non-native speaker's utterance, with a capability to evaluate features such as accent and rhythm. In this method, human evaluators evaluate the accent and the rhythm independently by using average voice model and prosody substitution. We also investigated the advantages of evaluating those features independently. We found that, when the prosodic features are not evaluated independently, the accent scores are affected by the goodness of the rhythm and vice versa. The correlation coefficient of the accent score and the rhythm score of identical utterances was 0.23 using the conventional method and −0.026 using the proposed method. This also leads to greater disagreement between the scores given by different evaluators. Using the conventional method, 23% of the pairs between evaluators have their inter-evaluator correlation of the rhythm score more than 0.5, while using this proposed method, 67% of the pairs have the inter-evaluator correlation more than 0.5.","PeriodicalId":184170,"journal":{"name":"2016 International Conference on Audio, Language and Image Processing (ICALIP)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123926467","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Movie audio scene recognition based on WFST","authors":"Jichen Yang, Min Cai, Yanxiong Li, Hai Jin","doi":"10.1109/ICALIP.2016.7846543","DOIUrl":"https://doi.org/10.1109/ICALIP.2016.7846543","url":null,"abstract":"To improve movie audio scene (MAS) recognition accuracy, this paper proposes using a weighted finite-state transducer (WFST) to recognize MASs. The WFST is introduced first, followed by how to construct it; the WFST is then used to recognize MASs using FBANK, MFCC, and PLPCC features separately. The experimental results on twenty MASs using the three features show that the WFST can recognize MASs well, and that the FBANK feature performs better than MFCC and PLPCC, reaching 79.9%.","PeriodicalId":184170,"journal":{"name":"2016 International Conference on Audio, Language and Image Processing (ICALIP)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122394428","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Computer virtual reconstruction of a three dimensional scene in integral imaging","authors":"Min Guo, Yu-juan Si, Shigang Wang, Yuan-zhi Lyu, Bowen Jia, Wei Wu","doi":"10.1109/ICALIP.2016.7846529","DOIUrl":"https://doi.org/10.1109/ICALIP.2016.7846529","url":null,"abstract":"To solve the problem of rapid perception of the collected content and non-contact measurement in integral imaging, computer virtual reconstruction algorithm of a three dimensional scene is proposed. Firstly, calculate the combined disparity map of an elemental image array using the region-based iterative matching algorithm according to the distribution characteristics of the homologous pixels in the elemental image array; then calculate the spatial coordinates of the reconstructed object points in line with the triangulation principle; and at last, delete error points and reduce the data redundancy using the data simplification algorithm based on the scanning line, and then the reconstructed three dimensional scene is obtained. The experimental results indicate that the method can not only reconstruct a clear and complete three dimensional scene, restore the relative position of the objects, but also measure the objects' sizes in the three dimensional scene.","PeriodicalId":184170,"journal":{"name":"2016 International Conference on Audio, Language and Image Processing (ICALIP)","volume":"87 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124857862","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Supervised Feature Learning Network Based on the Improved LLE for face recognition","authors":"Dan Meng, Guitao Cao, W. Cao, Zhihai He","doi":"10.1109/ICALIP.2016.7846591","DOIUrl":"https://doi.org/10.1109/ICALIP.2016.7846591","url":null,"abstract":"Deep neural networks (DNNs) have been successfully applied in the fields of computer vision and pattern recognition. One drawback of DNNs is that most existing DNN models and their variants usually need to learn a very large set of parameters. Another drawback is that DNNs do not fully take the class label and local structure into account during the training stage. To address these issues, this paper proposes a novel approach, called Supervised Feature Learning Network Based on the Improved LLE (SFLNet), for face recognition. The goal of SFLNet is to extract features efficiently. Thus SFLNet consists of learning kernels based on the improved Locally Linear Embedding (LLE) and multiscale feature analysis. Instead of taking image pixels as the input of the LLE algorithm, the improved LLE uses the linear discriminant kernel distance (LDKD). Besides, the outputs of the improved LLE are convolutional kernels, not dimensionality-reduced features. Multiscale feature analysis enhances insensitivity to complex changes caused by large pose, expression, or illumination variations. So SFLNet has better discrimination and is more suitable for the face recognition task. Experimental results on the Extended Yale B and AR datasets show the impressive improvement of the proposed method and its robustness to occlusion when compared with other state-of-the-art methods.","PeriodicalId":184170,"journal":{"name":"2016 International Conference on Audio, Language and Image Processing (ICALIP)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122143732","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Corrupted old film sequences restoration using improved PatchMatch and low-rank matrix recovery","authors":"Ting Yu, Youdong Ding, Xi Huang, Bing Wu","doi":"10.1109/ICALIP.2016.7846589","DOIUrl":"https://doi.org/10.1109/ICALIP.2016.7846589","url":null,"abstract":"Taking corrupted old film as the research object, this paper proposes a new video restoration method based on improved PatchMatch and low-rank matrix recovery. Our method is divided into three steps. First, we divide each frame in the video sequence into overlapping image patches, and similar interframe patches are found using the proposed improved PatchMatch algorithm. Then, low-rank matrix recovery is used to separate the patch group into a low-rank matrix component and a sparse error component. Finally, we synthesize the video frame from the recorded locations of the patches and complete the multi-frame joint automatic restoration frame by frame. The proposed method has been tested on a set of old film sequences. Experiments demonstrate that it is an effective method for corrupted old film restoration.","PeriodicalId":184170,"journal":{"name":"2016 International Conference on Audio, Language and Image Processing (ICALIP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128965589","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Progressive compression and transmission of 3D model with WebGL","authors":"Pengfei Li, Xiaoqing Yu, Jingjing Wang","doi":"10.1109/ICALIP.2016.7846665","DOIUrl":"https://doi.org/10.1109/ICALIP.2016.7846665","url":null,"abstract":"This paper presents a system for progressive compression and transmission of 3D models with WebGL. An algorithm based on edge collapse is chosen to perform the compression and is modified to fit the system. WebGL is used in the client part of the system, so the system can be multi-platform as long as a browser on that platform, such as Chrome or Firefox, supports HTML5. These browsers support almost all platforms. With the progressive compression method running on the server side and WebGL running in the browser, users can view 3D models much more quickly, even on a smartphone, while the quality of the models remains acceptable.","PeriodicalId":184170,"journal":{"name":"2016 International Conference on Audio, Language and Image Processing (ICALIP)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128742875","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}