{"title":"An HNM Based Scheme for Synthesizing Mandarin Syllable Signal","authors":"H. Gu, Yan-Zuo Zhou","doi":"10.30019/IJCLCLP.200809.0004","DOIUrl":"https://doi.org/10.30019/IJCLCLP.200809.0004","url":null,"abstract":"In this paper, an HNM based scheme is developed to synthesize Mandarin syllable signals. With this scheme, a Mandarin syllable can be recorded just once, and diverse prosodic characteristics can be synthesized for it without suffering significant signal-quality degradation. In our scheme, a synthetic syllable's duration is subdivided to its comprising phonemes and a piece-wise linear mapping function is constructed. With this mapping function, a control point on a synthetic syllable can be mapped to locate its corresponding analysis frames. Then, the analysis frames' HNM parameters are interpolated to obtain the HNM parameters for the control point. Furthermore, for pitch-height adjusting, another timbre-preserving interpolation is performed on the HNM parameters of a control point. Thereafter, signal samples are synthesized according to the HNM synthesis equations rewritten here. This HNM based scheme has been programmed to synthesize Mandarin speech. According to the perception tests, our HNM based scheme is found to be apparently better than a PSOLA based scheme in signal clarity, i.e. much clearer and no reverberation.","PeriodicalId":436300,"journal":{"name":"Int. J. Comput. Linguistics Chin. Lang. Process.","volume":"100 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128001694","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Question Analysis and Answer Passage Retrieval for Opinion Question Answering Systems","authors":"Lun-Wei Ku, Yu-Ting Liang, Hsin-Hsi Chen","doi":"10.30019/IJCLCLP.200809.0003","DOIUrl":"https://doi.org/10.30019/IJCLCLP.200809.0003","url":null,"abstract":"Question answering systems provide an elegant way for people to access an underlying knowledge base. However, people are interested in not only factual questions, but also opinions. This paper deals with question analysis and answer passage retrieval in opinion QA systems. For question analysis, six opinion question types are defined. A two-layered framework utilizing two question type classifiers is proposed. Algorithms for these two classifiers are described. The performance achieves 87.8% in general question classification and 92.5% in opinion question classification. The question focus is detected to form a query for the information retrieval system and the question polarity is detected to retain relevant sentences which have the same polarity as the question. For answer passage retrieval, three components are introduced. Relevant sentences retrieved are further identified as to whether the focus (Focus Detection) is in a scope of opinion (Opinion Scope Identification) or not, and, if yes, whether the polarity of the scope and the polarity of the question (Polarity Detection) match with each other. The best model achieves an F-measure of 40.59% by adopting partial match for relevance detection at the level of meaningful unit. With relevance issues removed, the F-measure of the best model boosts up to 84.96%.","PeriodicalId":436300,"journal":{"name":"Int. J. Comput. Linguistics Chin. Lang. Process.","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125104025","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Language Information Retrieval Approach to Writing Assistance","authors":"Jyishane Liu, Pei-Chun Hung, Ching-Ying Lee","doi":"10.30019/IJCLCLP.200809.0002","DOIUrl":"https://doi.org/10.30019/IJCLCLP.200809.0002","url":null,"abstract":"We observe that current language resource tools only provide limited help for ESL/EFL writers with insufficient language knowledge. In particular, there is no convenient way for ESL/EFL writers to look for answers to the frequent questions of correct and appropriate language use. We have developed a language information retrieval method to exploit corporal resources and provide effective referential utility for ESL/EFL writing. This method involves the sequential operation of three modules, an expression element module, a retrieval module, and a ranking module. The primary design purpose is to allow flexible and easy transformation from questions to queries and to find relevant examples so that uncertainty of language use can be quickly resolved. We implemented the method and developed a prototype system called SAW (Sentence Assistance for Writing). Simulated language use problems were tested on SAW to evaluate the system’s referential utility. Experimental results indicate that the proposed language information retrieval method is effective in providing help to ESL/EFL writers.","PeriodicalId":436300,"journal":{"name":"Int. J. Comput. Linguistics Chin. Lang. Process.","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127539747","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Thesaurus-Based Semantic Classification of English Collocations","authors":"Chung-Chi Huang, Kate H. Kao, Chiung-Hui Tseng, Jason J. S. Chang","doi":"10.30019/IJCLCLP.200909.0002","DOIUrl":"https://doi.org/10.30019/IJCLCLP.200909.0002","url":null,"abstract":"Researchers have developed many computational tools aimed at extracting collocations for both second language learners and lexicographers. Unfortunately, the tremendously large number of collocates returned by these tools usually overwhelms language learners. In this paper, we introduce a thesaurus-based semantic classification model that automatically learns semantic relations for classifying adjective-noun (A-N) and verb-noun (V-N) collocations into different thesaurus categories. Our model is based on iterative random walking over a weighted graph derived from an integrated knowledge source of word senses in WordNet and semantic categories of a thesaurus for collocation classification. We conduct an experiment on a set of collocations whose collocates involve varying levels of abstractness in the collocation usage box of Macmillan English Dictionary. Experimental evaluation with a collection of 150 multiple-choice questions commonly used as a similarity benchmark in the TOEFL synonym test shows that a thesaurus structure is successfully imposed to help enhance collocation production for L2 learners. As a result, our methodology may improve the effectiveness of state-of-the-art collocation reference tools concerning the aspects of language understanding and learning, as well as lexicography.","PeriodicalId":436300,"journal":{"name":"Int. J. Comput. Linguistics Chin. Lang. Process.","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131163525","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Cross-Linguistic Study of Voice Onset Time in Stop Consonant Productions","authors":"K. Chao, Li-Mei Chen","doi":"10.30019/IJCLCLP.200806.0005","DOIUrl":"https://doi.org/10.30019/IJCLCLP.200806.0005","url":null,"abstract":"This study examines voice onset time (VOT) for phonetically voiceless word-initial stops in Mandarin Chinese and in English, as spoken by 11 Mandarin speakers and 4 British English speakers. The purpose of this paper is to compare Mandarin and English VOT patterns and to categorize their stop realizations along the VOT continuum. As expected, the findings reveal that voiceless aspirated stops in Mandarin and in English occur at different places along the VOT continuum and the differences reach significance. The results also suggest that the three universal VOT categories (i.e. long lead, short lag, and long lag) are not fine enough to distinguish the voiceless stops of these two languages.","PeriodicalId":436300,"journal":{"name":"Int. J. Comput. Linguistics Chin. Lang. Process.","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125126604","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multiple Document Summarization Using Principal Component Analysis Incorporating Semantic Vector Space Model","authors":"O. Vikas, A. Meshram, Girraj Meena, Amit Gupta","doi":"10.30019/IJCLCLP.200806.0001","DOIUrl":"https://doi.org/10.30019/IJCLCLP.200806.0001","url":null,"abstract":"Text Summarization is very effective in relevant assessment tasks. The Multiple Document Summarizer presents a novel approach to select sentences from documents according to several heuristic features. Summaries are generated modeling the set of documents as Semantic Vector Space Model (SVSM) and applying Principal Component Analysis (PCA) to extract topic features. Pure Statistical VSM assumes terms to be independent of each other and may result in inconsistent results. Vector space is enhanced semantically by modifying the weight of the word vector governed by Appearance and Disappearance (Action class) words. The knowledge base for Action words is maintained by classifying the words as Appearance or Disappearance with the help of Wordnet. The weights of the action words are modified in accordance with the Object list prepared by the collection of nouns corresponding to the action words. Summary thus generated provides more informative content as semantics of natural language has been taken into consideration.","PeriodicalId":436300,"journal":{"name":"Int. J. Comput. Linguistics Chin. Lang. Process.","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128655306","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Effects of Formal Schema on Reading Comprehension: An Experiment with Chinese EFL Readers","authors":"Xiaoyan Zhang","doi":"10.30019/IJCLCLP.200806.0004","DOIUrl":"https://doi.org/10.30019/IJCLCLP.200806.0004","url":null,"abstract":"This study attempts to explore the effects of formal schemata or rhetorical patterns on reading comprehension through detailed analysis of a case study of 45 non-English majors from X University. The subjects were selected from three classes of comparable English level and were divided into three groups. Each group was asked to recall the text and finish a cloze test after reading one of three versions of a passage with identical content but different formal schemata: description schema, comparison and contrast schema, and problem-solution schema. Both quantitative and qualitative analyses of the recall protocol indicate that subjects displayed better recall of the text with highly structured schema than the one with loosely controlled schema, which suggests that formal schemata has a significant effect on written communication and the teaching of formal schemata to students is necessary to enhance their writing ability.","PeriodicalId":436300,"journal":{"name":"Int. J. Comput. Linguistics Chin. Lang. Process.","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131533597","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Study on Consistency Checking Method of Part-Of-Speech Tagging for Chinese Corpora","authors":"Hu Zhang, Jia-heng Zheng","doi":"10.30019/IJCLCLP.200806.0002","DOIUrl":"https://doi.org/10.30019/IJCLCLP.200806.0002","url":null,"abstract":"Ensuring consistency of Part-Of-Speech (POS) tagging plays an important role in the construction of high-quality Chinese corpora. After having analyzed the POS tagging of multi-category words in large-scale corpora, we propose a novel classification-based consistency checking method of POS tagging in this paper. Our method builds a vector model of the context of multi-category words along with using the k-NN algorithm to classify context vectors constructed from POS tagging sequences and to judge their consistency. These methods are evaluated on our 1.5M-word corpus. The experimental results indicate that the proposed method is feasible and effective.","PeriodicalId":436300,"journal":{"name":"Int. J. Comput. Linguistics Chin. Lang. Process.","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131279154","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring Shallow Answer Ranking Features in Cross-Lingual and Monolingual Factoid Question Answering","authors":"Cheng-Wei Lee, Yi-Hsun Lee, W. Hsu","doi":"10.30019/IJCLCLP.200803.0001","DOIUrl":"https://doi.org/10.30019/IJCLCLP.200803.0001","url":null,"abstract":"Answer ranking is critical to a QA (Question Answering) system because it determines the final system performance. In this paper, we explore the behavior of shallow ranking features under different conditions. The features are easy to implement and are also suitable when complex NLP techniques or resources are not available for monolingual or cross-lingual tasks. We analyze six shallow ranking features, namely, SCO-QAT, keyword overlap, density, IR score, mutual information score, and answer frequency. SCO-QAT (Sum of Co-occurrence of Question and Answer Terms) is a new feature proposed by us that performed well in NTCIR CLQA. It is a co-occurrence based feature that does not need extra knowledge, word-ignoring heuristic rules, or special tools. Instead, for the whole corpus, SCO-QAT calculates co-occurrence scores based solely on the passage retrieval results. Our experiments show that there is no perfect shallow ranking feature for every condition. SCO-QAT performs the best in C-C (Chinese-Chinese) QA, but it is not a good choice in E-C (English-Chinese) QA. Overall, Frequency is the best choice for E-C QA, but its performance is impaired when translation noise is present. We also found that passage depth has little impact on shallow ranking features, and that a proper answer filter with fined-grained answer types is important for E-C QA. We measured the performance of answer ranking in terms of a newly proposed metric EAA (Expected Answer Accuracy) to cope with cases of answers that have the same score after ranking.","PeriodicalId":436300,"journal":{"name":"Int. J. Comput. Linguistics Chin. Lang. Process.","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122382394","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integrating Complementary Features from Vocal Source and Vocal Tract for Speaker Identification","authors":"Nengheng Zheng, Tan Lee, Ning Wang, P. Ching","doi":"10.30019/IJCLCLP.200709.0004","DOIUrl":"https://doi.org/10.30019/IJCLCLP.200709.0004","url":null,"abstract":"This paper describes a speaker identification system that uses complementary acoustic features derived from the vocal source excitation and the vocal tract system. Conventional speaker recognition systems typically adopt the cepstral coefficients, e.g., Mel-frequency cepstral coefficients (MFCC) and linear predictive cepstral coefficients (LPCC), as the representative features. The cepstral features aim at characterizing the formant structure of the vocal tract system. This study proposes a new feature set, named the wavelet octave coefficients of residues (WOCOR), to characterize the vocal source excitation signal. WOCOR is derived by wavelet transformation of the linear predictive (LP) residual signal and is capable of capturing the spectro-temporal properties of vocal source excitation. WOCOR and MFCC contain complementary information for speaker recognition since they characterize two physiologically distinct components of speech production. The complementary contributions of MFCC and WOCOR in speaker identification are investigated. A confidence measure based score-level fusion technique is proposed to take full advantage of these two complementary features for speaker identification. Experiments show that an identification system using both MFCC and WOCOR significantly outperforms one using MFCC only. In comparison with the identification error rate of 6.8% obtained with MFCC-based system, an error rate of 4.1% is obtained with the proposed confidence measure based integrating system.","PeriodicalId":436300,"journal":{"name":"Int. J. Comput. Linguistics Chin. Lang. Process.","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130093205","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}