Int. J. Comput. Linguistics Chin. Lang. Process.: Latest Articles (Page 10)

Measuring Relationship among Dialects: DOC and Related Resources

Int. J. Comput. Linguistics Chin. Lang. Process. Pub Date: 1997-02-01 DOI: 10.30019/IJCLCLP.199702.0002

Chin-Chuan Cheng

Citations: 33

MAT - A Project to Collect Mandarin Speech Data Through Telephone Networks in Taiwan

Int. J. Comput. Linguistics Chin. Lang. Process. Pub Date: 1997-02-01 DOI: 10.30019/IJCLCLP.199702.0003

Hsiao-Chuan Wang

Citations: 48

A Model for Robust Chinese Parser

Int. J. Comput. Linguistics Chin. Lang. Process. Pub Date: 1996-08-01 DOI: 10.30019/IJCLCLP.199608.0006

Keh-Jiann Chen

Abstract: The Chinese language has many special characteristics which are substantially different from western languages, causing conventional methods of language processing to fail on Chinese. For example, Chinese sentences are composed of strings of characters without word boundaries that are marked by spaces. Therefore, word segmentation and unknown word identification techniques must be used in order to identify words in Chinese. In addition, Chinese has very few inflectional or grammatical markers, making purely syntactic approaches to parsing almost impossible. Hence, a unified approach which involves both syntactic and semantic information must be used. Therefore, a lexical feature-based grammar formalism, called Information-based Case Grammar, is adopted for the parsing model proposed here. This grammar formalism stipulates that a lexical entry for a word contains both semantic and syntactic feature structures. By relaxing the constraints on lexical feature structures, even ill-formed input can be accepted, broadening the coverage of the grammar. A model of a priority controlled chart parser is proposed which, in conjunction with a mechanism of dynamic grammar extension, addresses the problems of: (1) syntactic ambiguities, (2) under-specification and limited coverage of grammars, and (3) ill-formed sentences. The model does this without causing inefficient parsing of sentences that do not require relaxation of constraints or dynamic extension of the grammar.

Citations: 16

An Overview of Corpus-Based Statistics-Oriented (CBSO) Techniques for Natural Language Processing

Int. J. Comput. Linguistics Chin. Lang. Process. Pub Date: 1996-08-01 DOI: 10.30019/IJCLCLP.199608.0004

Keh-Yih Su, Tung-Hui Chiang, Jing-Shin Chang

Abstract: A Corpus-Based Statistics-Oriented (CBSO) methodology, which is an attempt to avoid the drawbacks of traditional rule-based approaches and purely statistical approaches, is introduced in this paper. Rule-based approaches, with rules induced by human experts, had been the dominant paradigm in the natural language processing community. Such approaches, however, suffer from serious difficulties in knowledge acquisition in terms of cost and consistency. Therefore, it is very difficult for such systems to be scaled-up. Statistical methods, with the capability of automatically acquiring knowledge from corpora, are becoming more and more popular, in part, to amend the shortcomings of rule-based approaches. However, most simple statistical models, which adopt almost nothing from existing linguistic knowledge, often result in a large parameter space and, thus, require an unaffordably large training corpus for even well-justified linguistic phenomena. The corpus-based statistics-oriented (CBSO) approach is a compromise between the two extremes of the spectrum for knowledge acquisition. CBSO approach emphasizes use of well-justified linguistic knowledge in developing the underlying language model and application of statistical optimization techniques on top of high level constructs, such as annotated syntax trees, rather than on surface strings, so that only a training corpus of reasonable size is needed for training and long distance dependency between constituents could be handled. In this paper, corpus-based statistics-oriented techniques are reviewed. General techniques applicable to CBSO approaches are introduced. In particular, we shall address the following important issues: (1) general tasks in developing an NLP system; (2) why CBSO is the preferred choice among different strategies; (3) how to achieve good performance systematically using a CBSO approach, and (4) frequently used CBSO techniques. Several examples are also reviewed.

Citations: 20

A Hybrid Approach to Machine Translation System Design

Int. J. Comput. Linguistics Chin. Lang. Process. Pub Date: 1996-08-01 DOI: 10.30019/IJCLCLP.199608.0005

Kuang-hua Chen, Hsin-Hsi Chen

Citations: 9

A Survey on Automatic Speech Recognition with an Illustrative Example on Continuous Speech Recognition of Mandarin

Int. J. Comput. Linguistics Chin. Lang. Process. Pub Date: 1996-08-01 DOI: 10.30019/IJCLCLP.199608.0001

Chin-Hui Lee, B. Juang

Citations: 16

Important Issues on Chinese Information Retrieval

Int. J. Comput. Linguistics Chin. Lang. Process. Pub Date: 1996-08-01 DOI: 10.30019/IJCLCLP.199608.0007

Lee-Feng Chien, H. Pu

Citations: 24