A comparison of collation algorithm for Myanmar language

Authors: Yuzana, Khin Mar Lar Tun
Published in: 2008 Third International Conference on Digital Information Management (2008-11-01)
DOI: 10.1109/ICDIM.2008.4746740 (https://doi.org/10.1109/ICDIM.2008.4746740)
Written Myanmar does not use white space to mark word boundaries, and Unicode-based database applications offer little support for operations such as collation and searching. The need for a robust collation strategy has prompted wide-ranging research in natural language processing. We therefore propose a new collation algorithm for the Myanmar language, MyCollate2, which extends MyCollate1. The algorithm is based on a heuristic chart (table). It first slices names into syllables and then collates them according to the order of the traditional standard Myanmar dictionary. The proposed heuristic chart works well not only for syllable segmentation but also for the collation of words. The algorithm can collate Myanmar names as well as Myanmar words with complex syllable structures, such as Pali, Pali loan styles, subscript styles, and kinzi styles. We tested it on Myanmar names, Pali words from Damma books, and words from a dictionary. The experimental results show that syllable slicing reaches 99.55% accuracy when compared with other methods, and slicing performance is also reported. Collation accuracy reaches 95.88%, which is better than that of the previous collation algorithm, MyCollate1.
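
The two-stage idea described in the abstract (slice into syllables, then order by a table) can be illustrated with a minimal sketch. This is not MyCollate2 itself: the table `SYLLABLE_RANK`, its romanized placeholder syllables, and the functions `slice_syllables`, `collation_key`, and `collate` are all hypothetical, and greedy longest-match slicing is an assumed stand-in for the paper's heuristic chart.

```python
# Hypothetical toy table mapping a syllable to its dictionary rank.
# The romanized entries are placeholders, not the paper's heuristic chart.
SYLLABLE_RANK = {"ka": 0, "kha": 1, "ga": 2, "nga": 3}


def slice_syllables(word, table=SYLLABLE_RANK):
    """Split a word into syllables by greedy longest match against the table."""
    max_len = max(len(s) for s in table)
    syllables = []
    i = 0
    while i < len(word):
        # Try the longest candidate first, then shorter ones.
        for length in range(min(max_len, len(word) - i), 0, -1):
            piece = word[i:i + length]
            if piece in table:
                syllables.append(piece)
                i += length
                break
        else:
            raise ValueError(f"no table syllable matches at position {i} of {word!r}")
    return syllables


def collation_key(word, table=SYLLABLE_RANK):
    """Build a sort key: the tuple of ranks of the word's syllables."""
    return tuple(table[s] for s in slice_syllables(word, table))


def collate(words, table=SYLLABLE_RANK):
    """Sort words syllable by syllable according to the table's order."""
    return sorted(words, key=lambda w: collation_key(w, table))


if __name__ == "__main__":
    print(collate(["ga", "khaka", "ka", "kanga"]))
```

In this toy setup the table order differs from plain character order, so `collate` and the built-in `sorted` return different results for the same input, which is the motivation for collating on syllable ranks rather than on raw code points.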