Refining the Unrestricted Character Encoding for Japanese

A. Bossard, K. Kaneko
International Conference on Computers and Their Applications, published 2019-03-13. DOI: 10.29007/WSKT (https://doi.org/10.29007/WSKT)

Abstract

In previous work, we proposed an unrestricted character encoding for Japanese (UCEJ). Unlike, for instance, Unicode, this encoding has a three-dimensional structure designed to improve the usability of the code, easier character lookup being one application. In this paper, we propose several important refinements to UCEJ: first, the addition of the Latin and kana character sets, which are ubiquitous in Japanese; second, the inclusion of character stroke order and stroke types in the code and in the corresponding binary representation. We estimate the average and worst-case memory complexity of the proposed encoding and conduct an experiment to measure the memory it requires in practice, in each case comparing the proposal with conventional encodings.
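To make the idea of a three-coordinate code that also carries stroke information more concrete, here is a minimal Python sketch. It is not the UCEJ binary layout, which the abstract does not specify: the class name `Code3D`, the field widths (`COORD_BITS`, `COUNT_BITS`, `STROKE_BITS`), the bit order, and the coordinate and stroke-type values in the usage lines are all assumptions chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical field widths (assumptions, not taken from the paper).
COORD_BITS = 16   # bits per coordinate of the three-dimensional code
COUNT_BITS = 6    # bits storing the number of strokes
STROKE_BITS = 6   # bits per stroke-type identifier


@dataclass(frozen=True)
class Code3D:
    """Illustrative three-coordinate code with an ordered stroke-type list."""
    x: int
    y: int
    z: int
    strokes: Tuple[int, ...] = ()  # stroke types, listed in stroke order

    def to_bytes(self) -> bytes:
        """Pack all fields big-endian, zero-padding the final byte."""
        fields = [(self.x, COORD_BITS), (self.y, COORD_BITS),
                  (self.z, COORD_BITS), (len(self.strokes), COUNT_BITS)]
        fields += [(s, STROKE_BITS) for s in self.strokes]
        value, nbits = 0, 0
        for field, width in fields:
            if not 0 <= field < (1 << width):
                raise ValueError(f"{field} does not fit in {width} bits")
            value = (value << width) | field
            nbits += width
        nbytes = (nbits + 7) // 8
        value <<= nbytes * 8 - nbits  # left-align so decoding reads from the top
        return value.to_bytes(nbytes, "big")

    @classmethod
    def from_bytes(cls, data: bytes) -> "Code3D":
        """Inverse of to_bytes: read the fields from the most significant bit."""
        value = int.from_bytes(data, "big")
        remaining = len(data) * 8

        def take(width: int) -> int:
            nonlocal remaining
            remaining -= width
            return (value >> remaining) & ((1 << width) - 1)

        x, y, z = take(COORD_BITS), take(COORD_BITS), take(COORD_BITS)
        count = take(COUNT_BITS)
        strokes = tuple(take(STROKE_BITS) for _ in range(count))
        return cls(x, y, z, strokes)


# Usage: made-up coordinates and four made-up stroke-type identifiers.
code = Code3D(1, 2, 3, (0, 1, 2, 3))
packed = code.to_bytes()
assert Code3D.from_bytes(packed) == code
print(len(packed), len("木".encode("utf-8")))  # bytes in this sketch vs. UTF-8
```

Under these assumed widths the record grows by `STROKE_BITS` per stroke, so its size depends on the character's stroke count. This is the kind of per-character cost that an average and worst-case memory estimate would need to account for when comparing against a conventional encoding such as UTF-8.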