Core vocabulary of Thai language for Thai picture based communication system
Sarinya Chompoobutr, M. Boriboon, Wantanee Phantachat, Puttachart Potibal
International Convention on Rehabilitation Engineering & Assistive Technology, 2009-04-22
DOI: 10.1145/1592700.1592736 (https://doi.org/10.1145/1592700.1592736)
Citations: 3
Abstract
This paper demonstrates how to obtain a core vocabulary for Thai. Words were collected from written-language material across four sources: the BEST corpus (2009), the Thai dictionary of the Royal Institute of Thailand (RI, 1982), the Lexicon of preschool and elementary student (1988), and "Khlang Kham" by Nawawan Phanthumetha (2001). The combined corpora were analyzed to identify the core vocabulary of the Thai language. The results indicate that the first 100 words, the core vocabulary, account for 49.92 per cent of the corpora. Almost all of them can serve as two or more parts of speech, depending on their position and context in a sentence.
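
The coverage figure reported above can be illustrated with a minimal sketch. It assumes the words are ranked by frequency and that the text has already been segmented into word tokens (Thai is written without spaces between words, so segmentation is a separate step not shown here). The function name `top_n_coverage` and the toy token list are hypothetical and are not taken from the paper.

```python
from collections import Counter


def top_n_coverage(tokens, n):
    """Return the n most frequent words and the share of all tokens they cover."""
    counts = Counter(tokens)
    total = sum(counts.values())
    top = counts.most_common(n)  # list of (word, count), most frequent first
    covered = sum(count for _, count in top)
    share = covered / total if total else 0.0
    return [word for word, _ in top], share


# Toy, pre-segmented token list (illustrative only, not data from the paper).
tokens = ["ไป", "กิน", "ไป", "ดี", "กิน", "ไป", "บ้าน", "น้ำ"]
core, share = top_n_coverage(tokens, 2)
print(core, f"{share:.2%}")
```

Applied to a full corpus with `n = 100`, the same calculation would yield a percentage comparable in kind to the 49.92 per cent reported in the abstract, though the authors' exact counting procedure is not described here.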