DEVELOPING AN ONLINE CORPUS OF FORMOSAN LANGUAGES
Authors: Li-May Sung, L. Su, Fuhui Hsieh, Zhemin Lin
Journal: Taiwan Journal of Linguistics, vol. 6, pp. 79-117 (Journal Article)
Published: 2008-12-01
DOI: 10.6519/TJL.2008.6(2).4 (https://doi.org/10.6519/TJL.2008.6(2).4)
Citations: 4
Abstract
Information technologies have now matured to the point of enabling researchers to create a repository of language resources, especially for languages facing endangerment. The development of an online platform of corpora, made possible by recent advances in data storage, character encoding and web technology, has profound consequences for the accessibility, quantity, quality and interoperability of linguistic field data. This is of particular significance for the Formosan languages of Taiwan, many of which are on the verge of extinction. In response to this growing problem, the NTU Corpus of Formosan Languages was established with the key objectives of documenting and thus preserving valuable linguistic data, as well as relevant ethnological and cultural information. This paper introduces some of the theoretical bases behind this initiative, as well as the procedures, transcription conventions, database normalization, in-house system and three special features involved in the creation of this corpus.
About the journal:
Taiwan Journal of Linguistics is an international journal dedicated to the publication of research papers in linguistics and welcomes contributions in all areas of the scientific study of language. Contributions may be submitted from all countries and are accepted all year round. The language of publication is English. There are no restrictions on regular submission; however, manuscripts simultaneously submitted to other publications cannot be accepted. TJL adheres to a strict standard of double-blind reviews to minimize biases that might be caused by knowledge of the author’s gender, culture, or standing within the professional community. Once a manuscript is determined as potentially suitable for the journal after an initial screening by the editor, all information that may identify the author is removed, and copies are sent to at least two qualified reviewers. The selection of reviewers is based purely on professional considerations and their identity will be kept strictly confidential by TJL. All feedback from the reviewers, except such comments as may be specifically referred to the attention of the editor, is faithfully relayed to the authors to assist them in improving their work, regardless of whether the paper is to be accepted, accepted upon minor revision, revised and resubmitted, or rejected.