Refining the Unrestricted Character Encoding for Japanese

International Conference on Computers and Their Applications. Pub Date: 2019-03-13. DOI: 10.29007/WSKT

A. Bossard, K. Kaneko


Abstract

In previous work, we proposed an unrestricted character encoding for Japanese (UCEJ). Unlike, for instance, Unicode, this encoding has an advanced structure that relies on three dimensions in order to improve the usability of the code, easier character lookup being one application. In this paper, we propose several important refinements to the UCEJ encoding: first, the addition of the Latin and kana character sets, which are ubiquitous in Japanese, and second, the inclusion of character stroke order and stroke types in the code and in the corresponding binary representation. We estimate the average and worst-case memory complexity of the proposed encoding and conduct an experiment to measure the memory size required in practice, in each case comparing the proposal to conventional encodings.


