{"title":"Speech signal feature parameters extraction algorithm based on PCNN for isolated word recognition","authors":"Guang-yan Wang, Yi-ming Zhang, Mei-Lin Sun, Xia Wang, Yan Zhang","doi":"10.1109/ICALIP.2016.7846618","DOIUrl":"https://doi.org/10.1109/ICALIP.2016.7846618","url":null,"abstract":"In an isolated word speech recognition system, the extraction and matching of characteristic parameters is the key step. This paper introduces a new feature parameter extraction method based on the Pulse Coupled Neural Network (PCNN) for such a recognition system. Taking advantage of the visual nature of the speech spectrogram, the PCNN is used to extract the time series and entropy series from the spectrograms of words. Finally, the DTW algorithm is used to perform isolated word recognition, and simulation results demonstrate the feasibility and effectiveness of the proposed algorithm.","PeriodicalId":184170,"journal":{"name":"2016 International Conference on Audio, Language and Image Processing (ICALIP)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130364348","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Research and design of the future classroom based on big data and cloud processing","authors":"Lijuan Zhu","doi":"10.1109/ICALIP.2016.7846597","DOIUrl":"https://doi.org/10.1109/ICALIP.2016.7846597","url":null,"abstract":"With the rapid development and application of the IoT (Internet of Things), cloud processing, and big data, the research and design of the future classroom has become a hot issue in the educational research field. This paper analyzes the necessity and importance of the future classroom, presents its design concept, hardware structure, and key elements of its establishment, and finally describes in detail how to use big data and cloud processing to meet the future classroom's demands.","PeriodicalId":184170,"journal":{"name":"2016 International Conference on Audio, Language and Image Processing (ICALIP)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123674483","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Research on image retrieval technology based on image fingerprint and color features","authors":"Yaguang Wang, Aina Sui, Wenlong Fu","doi":"10.1109/ICALIP.2016.7846636","DOIUrl":"https://doi.org/10.1109/ICALIP.2016.7846636","url":null,"abstract":"Content-based video retrieval and content-based image retrieval have been hot research topics in recent years. Image feature extraction plays a very important role in the retrieval process. In this paper, we use image color features and the image fingerprint extracted by an improved perceptual hash algorithm. To combine these two features, we ran many tests to find the optimal weight values. We used these weights in image retrieval experiments on the Corel set and analyzed the results. Experiments show that the improved algorithm has higher retrieval efficiency and improved retrieval precision.","PeriodicalId":184170,"journal":{"name":"2016 International Conference on Audio, Language and Image Processing (ICALIP)","volume":"13 87","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114088649","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive depth map-based retinex for image defogging","authors":"Jun Liu, Jinxiu Zhu, Y. Pei, Yao Zhang","doi":"10.1109/ICALIP.2016.7846593","DOIUrl":"https://doi.org/10.1109/ICALIP.2016.7846593","url":null,"abstract":"Image defogging technology has attracted a lot of interest in the field of image processing. However, the structural characteristics of foggy images are rarely considered in state-of-the-art defogging algorithms. To overcome this weakness, this paper proposes an adaptive retinex defogging method based on the depth map for foggy images with complex structure. First, based on the thickness of each scene, the K-means algorithm is adopted to cluster the image into several patches with similar structural characteristics. Then, for each patch, an adaptive single-scale retinex model is built, which combines the mean depth of the scenes in each patch with retinex theory. Simulation results show that the proposed method offers defogging performance comparable to the conventional DCP and MSRCR methods, especially for degraded images with a complex structure.","PeriodicalId":184170,"journal":{"name":"2016 International Conference on Audio, Language and Image Processing (ICALIP)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127971848","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Construction of bottle-body autoencoder and its application to audio signal classification","authors":"Jichen Yang, Qianhua He, Min Cai, Yanxiong Li, Hai Jin","doi":"10.1109/ICALIP.2016.7846541","DOIUrl":"https://doi.org/10.1109/ICALIP.2016.7846541","url":null,"abstract":"To extract effective audio features using an autoencoder, this paper presents a bottle-body autoencoder which, unlike the traditional bottle-neck autoencoder, is constructed from restricted Boltzmann machines with the same number of neurons at every layer. The bottle-body feature, which is obtained by using the pseudo-inverse method to initialize weights, is applied to audio signal classification. The proposed approach is evaluated on the BBC Sound Effects Library and shows improvements of 14.90% and 16.20% in classification accuracy over the traditional Mel-frequency cepstral coefficient and bottle-neck features.","PeriodicalId":184170,"journal":{"name":"2016 International Conference on Audio, Language and Image Processing (ICALIP)","volume":"96 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132480740","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Non-rigid point set registration with multiple features","authors":"H. Tang, Yang Yang","doi":"10.1109/ICALIP.2016.7846559","DOIUrl":"https://doi.org/10.1109/ICALIP.2016.7846559","url":null,"abstract":"We present a new method for non-rigid registration with multiple features in this work. The proposed method is based on an alternating two-step process: correspondence estimation and transformation updating. We first define two vector features for measuring global and local structural differences between two point sets, respectively. We then combine the two features to build a multi-feature based energy function which provides a novel way to estimate correspondences by minimizing global or local feature differences using a linear assignment solution. To enhance the interaction between the two steps, we design an annealing scheme to gradually change the energy minimization from local to global feature differences and the thin plate spline transformation from rigid to non-rigid during the registration process. The registration results demonstrate that our method outperforms four state-of-the-art methods in most experiments.","PeriodicalId":184170,"journal":{"name":"2016 International Conference on Audio, Language and Image Processing (ICALIP)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128829837","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Isolated Chinese lyrics with accompaniment recognition based on SVM","authors":"Juanjuan Cai, Na Li, Hui Wang, Bin Zhu","doi":"10.1109/ICALIP.2016.7846536","DOIUrl":"https://doi.org/10.1109/ICALIP.2016.7846536","url":null,"abstract":"Speech recognition is one of the hot spots in the field of audio technology. For recognizing lyrics with accompaniment, there are two commonly used methods: one applies automatic speech recognition technology to singing recognition; the other uses sound classification, extracting audio features and then using a pattern-matching classifier for classification. In this paper, we use the sound classification method and adopt a self-built experimental database in which 31 classes of isolated Chinese lyrics (4650 in total) are excerpted from different songs, and we use these words as the units. Considering that speaking and singing share a similar mechanism, we extract 39-dimensional MFCC feature parameters, which are widely used in speech recognition. Using the training materials, we adjust the kernel parameters and choose functions to train an SVM classifier. The trained SVM classification system is then used to recognize the lyrics, and the average recognition accuracy is 42.80%.","PeriodicalId":184170,"journal":{"name":"2016 International Conference on Audio, Language and Image Processing (ICALIP)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122118917","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Require-documents and provide-documents matching algorithm based on topic model","authors":"Xiang Zou, Yue Wu, Zhongtian Liu","doi":"10.1109/ICALIP.2016.7846639","DOIUrl":"https://doi.org/10.1109/ICALIP.2016.7846639","url":null,"abstract":"The relationship between the internet and human life is closer than ever before. At the same time, technical collaboration between modern enterprises is becoming a common mode of production and will be a crucial one in the future. With the rapid development and popularization of the internet, the scope of collaboration keeps widening, and collaboration across cities or even across the whole world is becoming a reality. This paper gathers a set of require-documents and provide-documents through a project. Because of the special content structure of these documents, a require-document may be matched by several provide-documents, so the main problem is how to find matching provide-documents for a require-document, and matching require-documents for a provide-document, quickly and accurately. This paper proposes a require-documents and provide-documents matching algorithm based on a topic model; the algorithm relies on text similarity. Finally, the effect of the algorithm is verified.","PeriodicalId":184170,"journal":{"name":"2016 International Conference on Audio, Language and Image Processing (ICALIP)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116997698","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallel clustering by fast search and find of density peaks","authors":"Ji Chengheng, Lei Yongmei","doi":"10.1109/ICALIP.2016.7846664","DOIUrl":"https://doi.org/10.1109/ICALIP.2016.7846664","url":null,"abstract":"The clustering by fast search and find of density peaks algorithm shows good efficiency and accuracy, but its space complexity is too high because it must keep a global distance matrix in memory, so it can hardly cluster big datasets. To solve this problem, this paper designs a new strategy for searching for the important quantity δ; with the new strategy, the space complexity of the algorithm is greatly reduced. Based on that reduction, a corresponding load-balanced parallel clustering algorithm is presented. Experimental results show that the parallel algorithm is efficient and scalable.","PeriodicalId":184170,"journal":{"name":"2016 International Conference on Audio, Language and Image Processing (ICALIP)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124780630","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detection and pre-warning of vehicle lane change based on state machine","authors":"Xinnan Hu, Xing Zhang, Yougang Min, Xinghua Yao, Fei Wu, Juan Zhang","doi":"10.1109/ICALIP.2016.7846623","DOIUrl":"https://doi.org/10.1109/ICALIP.2016.7846623","url":null,"abstract":"This paper focuses on the problem of detecting a driver's lane changes and presents a state machine model for analyzing driving behavior. First, a three-axis acceleration sensor is used to gather data, and a smoothing filter is applied to the data. Second, the rate of change of acceleration is computed by linear regression, and the vehicle's turning range and acceleration/deceleration are analyzed. Third, the vehicle's running state is processed with fuzzy methods and a state machine is built to detect the vehicle's running state. Experimental results show that the state model can correctly detect the driver's driving behavior and send warning signals.","PeriodicalId":184170,"journal":{"name":"2016 International Conference on Audio, Language and Image Processing (ICALIP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129630054","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}