{"title":"台湾语言档案:语言分析与语言处理","authors":"Elizabeth Zeitoun, Ching-Hua Yu","doi":"10.30019/IJCLCLP.200507.0002","DOIUrl":null,"url":null,"abstract":"In this paper, we deal with the linguistic analysis approach adopted in the Formosan Language Corpora, one of the three main information databases included in the Formosan Language Archive, and the language processing programs that have been built upon it. We first discuss problems related to the transcription of different language corpora. We then deal with annotation rules and standards. We go on to explain the linguistic identification of clauses, sentences and paragraphs, and the computer programs used to obtain an alignment of words, glosses and sentences in Chinese and English. We finally show how we try to cope with analytic inconsistencies through programming. This paper is a complement to Zeitoun et al. [2003] in which we provided an overview of the whole architecture of the Formosan Language Archive.","PeriodicalId":436300,"journal":{"name":"Int. J. Comput. Linguistics Chin. Lang. Process.","volume":"10 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2005-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":"{\"title\":\"The Formosan Language Archive: Linguistic Analysis and Language Processing\",\"authors\":\"Elizabeth Zeitoun, Ching-Hua Yu\",\"doi\":\"10.30019/IJCLCLP.200507.0002\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, we deal with the linguistic analysis approach adopted in the Formosan Language Corpora, one of the three main information databases included in the Formosan Language Archive, and the language processing programs that have been built upon it. We first discuss problems related to the transcription of different language corpora. We then deal with annotation rules and standards. We go on to explain the linguistic identification of clauses, sentences and paragraphs, and the computer programs used to obtain an alignment of words, glosses and sentences in Chinese and English. We finally show how we try to cope with analytic inconsistencies through programming. This paper is a complement to Zeitoun et al. [2003] in which we provided an overview of the whole architecture of the Formosan Language Archive.\",\"PeriodicalId\":436300,\"journal\":{\"name\":\"Int. J. Comput. Linguistics Chin. Lang. Process.\",\"volume\":\"10 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2005-07-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"15\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Int. J. Comput. Linguistics Chin. Lang. Process.\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.30019/IJCLCLP.200507.0002\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Int. J. Comput. Linguistics Chin. Lang. Process.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.30019/IJCLCLP.200507.0002","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
The Formosan Language Archive: Linguistic Analysis and Language Processing
In this paper, we deal with the linguistic analysis approach adopted in the Formosan Language Corpora, one of the three main information databases included in the Formosan Language Archive, and the language processing programs that have been built upon it. We first discuss problems related to the transcription of different language corpora. We then deal with annotation rules and standards. We go on to explain the linguistic identification of clauses, sentences and paragraphs, and the computer programs used to obtain an alignment of words, glosses and sentences in Chinese and English. We finally show how we try to cope with analytic inconsistencies through programming. This paper is a complement to Zeitoun et al. [2003] in which we provided an overview of the whole architecture of the Formosan Language Archive.