{"title":"Transcribing Southern Min speech corpora with a Web-Based language learning system","authors":"Jun Cai, J. Feldmar, Y. Laprie, J. Haton","doi":"10.1109/ICALIP.2008.4590181","DOIUrl":"https://doi.org/10.1109/ICALIP.2008.4590181","url":null,"abstract":"The paper proposes a human-computation-based scheme for transcribing Southern Min speech corpora. The core idea is to implement a Web-based language learning system that collects orthographic and phonetic labels from a large number of language learners and chooses the commonly input labels as the transcriptions of the corpora. It is essentially a technology of distributed knowledge acquisition. Some computer-aided mechanisms are also used to verify the collected transcriptions. The benefit of the scheme is that it makes the transcribing task neither tedious nor costly: no significant budget is needed for transcribing large corpora. The design of a system for transcribing Min Nan speech corpora is described in detail. The application of a prototype version of the system shows that this transcribing scheme is an effective and economical way to generate orthographic and phonetic transcriptions.","PeriodicalId":175885,"journal":{"name":"2008 International Conference on Audio, Language and Image Processing","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115123756","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On parametric representations of the modified group delay","authors":"R. Padmanabhan, H. Murthy","doi":"10.1109/ICALIP.2008.4590239","DOIUrl":"https://doi.org/10.1109/ICALIP.2008.4590239","url":null,"abstract":"The modified group delay (MODGD) is a group delay based representation suited for processing speech signals. The MODGD is parameterized by three entities, α, γ and lifter_ω. Typically, optimal values of these parameters have to be determined by experimentation. In this paper, we propose a method to automatically determine an optimal value for lifter_ω, which enables the other two parameters to be set to 1.0. This will reduce the optimisation required to obtain meaningful MODGD values directly from the speech signal.","PeriodicalId":175885,"journal":{"name":"2008 International Conference on Audio, Language and Image Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115377468","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Research on the remote control simulation for the Loitering Attack Missile based on the distributed VR technique","authors":"Shengzhi Yuan, Xiaofang Xie, Aidong Liu, Jian Cao","doi":"10.1109/ICALIP.2008.4590143","DOIUrl":"https://doi.org/10.1109/ICALIP.2008.4590143","url":null,"abstract":"In order to master the remote control technology of the Loitering Attack Missile (LAM) and the application of the data link to the cruise missile, it is very important to develop a remote control simulation for the LAM. The structure of the simulation system was designed on the platform of the High Level Architecture (HLA). Key issues, such as the basic control models, the simulation of the data link, and the simulation of track following and scene matching, are introduced. In practice, a feasible toolchain of Creator, Vega and VC++ was applied to the development of the remote control simulation system for the LAM. The method is useful both in practice and for national defense, and can be used in other applications.","PeriodicalId":175885,"journal":{"name":"2008 International Conference on Audio, Language and Image Processing","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116669519","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design and implementation of embedded video terminal based on Z228","authors":"Jin Zhou, X. Ye","doi":"10.1109/ICALIP.2008.4590048","DOIUrl":"https://doi.org/10.1109/ICALIP.2008.4590048","url":null,"abstract":"As modern teaching methods have attracted growing attention, the demand for multimedia teaching equipment is growing rapidly. However, most multimedia teaching equipment is still at a low level. To improve the situation, research on an embedded video terminal for teaching systems has practical significance. In this paper, we introduce an embedded video terminal that supports MPEG-4 and is suited to multimedia teaching systems. The terminal, based on the Z228, is able to decode an MPEG-4 bit stream at up to 30 frames per second at VGA (640×480) resolution. Additionally, as a highlight of the design, the implemented terminal is a low-cost solution in which a single chip realizes the complex MPEG-4 decoder.","PeriodicalId":175885,"journal":{"name":"2008 International Conference on Audio, Language and Image Processing","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125190247","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A block matching criterion for interframe coding of video","authors":"R. Purwar, N. Prakash, N. Rajpal","doi":"10.1109/ICALIP.2008.4589962","DOIUrl":"https://doi.org/10.1109/ICALIP.2008.4589962","url":null,"abstract":"Interframe coding is used for removal of temporal redundancy in video data and motion compensation plays a very significant role in the interframe coding of such data. Motion compensation based on block matching technique generally uses the criterion of either minimum Mean Square Error (MSE)/ Mean Absolute Difference (MAD) value to find the suitable motion vector. Vector Matching Criterion (VMC) is another such method for motion compensation in the literature. In this manuscript, a new matching criterion for block based motion compensation is being proposed and compared with other existing techniques. The experimental results show that the proposed criterion of block matching gives excellent results in comparison to the existing criterion of block matching techniques.","PeriodicalId":175885,"journal":{"name":"2008 International Conference on Audio, Language and Image Processing","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125822983","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A subpixel color image registration algorithm using quaternion phase-only correlation","authors":"Wei Feng, Bo Hu, Cheng Yang","doi":"10.1109/ICALIP.2008.4590016","DOIUrl":"https://doi.org/10.1109/ICALIP.2008.4590016","url":null,"abstract":"In this paper, we extend the conventional phase-only correlation (POC) technique to the quaternion field (QPOC) and propose a subpixel color image registration algorithm based on the QPOC. Due to mathematical limitations, traditional POC-based registration algorithms can only be applied to grayscale images or at most complex images. A color image must be first converted to a grayscale one before performing the POC, during which the chrominance information has been wasted. The proposed algorithm not only can naturally make full use of the luminance as well as the chrominance information in color images, but also can directly estimate the subpixel translational shift from the 2D data array of the QPOC function. Experimental results have shown the effectiveness of our algorithm.","PeriodicalId":175885,"journal":{"name":"2008 International Conference on Audio, Language and Image Processing","volume":"47 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125905255","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A high gray scale TFT-LCD drive system","authors":"Zhijie Tang, Ran Feng, Meihua Xu","doi":"10.1109/ICALIP.2008.4590040","DOIUrl":"https://doi.org/10.1109/ICALIP.2008.4590040","url":null,"abstract":"This paper describes a new drive method of TFT-LCD for high gray scale display. The method is based on Rotating Ordered dithering algorithm and frame modulating technology. This method can use low-bit gray scale TFT-LCD panel to realize a high gray scale display. This method is applied to a 1280*1024 dots RGB 6-bit TFT-LCD panel which can display 256-level gray scale.","PeriodicalId":175885,"journal":{"name":"2008 International Conference on Audio, Language and Image Processing","volume":"83 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123708924","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A new regularization method for bi-level image restoration","authors":"Jianjun Zhang, Qin Wang","doi":"10.1109/ICALIP.2008.4589985","DOIUrl":"https://doi.org/10.1109/ICALIP.2008.4589985","url":null,"abstract":"Image restoration is an ill-posed inverse problem, which has introduced the regularization method to suppress over-amplification. In this paper, by explicitly using the a priori knowledge for bi-level images, we propose a new regularization method for bi-level image restoration. Unlike the well known Tikhonov regularization method which eventually results in a linear system of equations, the new regularization method leads to a nonlinear optimization problem. This nonlinear optimization problem is solved by using the global Barzilai and Borwein gradient method. Simulation results show that the proposed regularization method is feasible and effective for bi-level image restoration.","PeriodicalId":175885,"journal":{"name":"2008 International Conference on Audio, Language and Image Processing","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123770596","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A self-adaptive edge detection method based on LoG algorithm","authors":"Sifeng Wang, Jingxiu Zhao","doi":"10.1109/ICALIP.2008.4590163","DOIUrl":"https://doi.org/10.1109/ICALIP.2008.4590163","url":null,"abstract":"This paper analyzes the edge detection capability of the LoG operator and identifies its deficiencies in practical applications. To avoid these defects, the paper uses experimental data and the Matlab software to derive the relation between the entropy of the gray level co-occurrence matrix and the Gaussian space coefficient. The optimum Gaussian space coefficient of the LoG operator can then be acquired self-adaptively based on the entropy of the concrete image. The improved method therefore not only effectively suppresses most of the noise in the image, but also locates edges accurately.","PeriodicalId":175885,"journal":{"name":"2008 International Conference on Audio, Language and Image Processing","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125266117","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Compensated sum of absolute difference for fast H.264 inter mode selection","authors":"L. Po, Y. Uddin, Kai Guo, Liping Wang","doi":"10.1109/ICALIP.2008.4590012","DOIUrl":"https://doi.org/10.1109/ICALIP.2008.4590012","url":null,"abstract":"In this paper, a new compensated sum of absolute difference (CSAD) for a fast H.264 inter mode selection algorithm is proposed. The main idea is to determine the best inter mode based on the CSAD cost instead of the rate-distortion (RD) cost. This approach can avoid most of the computationally intensive processes in the H.264 mode decision. The CSAD addresses a problem of the SAD and SATD costs used in mode decision, which are normally biased toward the smaller block size modes because those modes usually achieve higher prediction accuracy but consume more bit rate. Experimental results show that the proposed CSAD-based mode decision algorithm can reduce the H.264 total encoding time by 60% to 68% with negligible degradation in the RD performance.","PeriodicalId":175885,"journal":{"name":"2008 International Conference on Audio, Language and Image Processing","volume":"104 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115217076","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}