Sharing User Dictionaries Across Multiple Systems with UTX-S

Francis Bond, Seiji Okura, Yu Yamamoto, Toshiki Murata, Kiyotaka Uchimoto, Michael Kato, Miwako Shimazu, Tsugiyoshi Suzuki
Published in: Conference of the Association for Machine Translation in the Americas, 2009-02-20
DOI: 10.1145/1499224.1499247 (https://doi.org/10.1145/1499224.1499247)
Citations: 2

Abstract

Careful tuning of user-created dictionaries is indispensable when using a machine translation (MT) system for computer-aided translation. However, there is no widely used standard for user dictionaries in the Japanese/English machine translation market. To address this issue, AAMT (the Asia-Pacific Association for Machine Translation) has established a specification for sharable dictionaries (UTX-S: Universal Terminology eXchange -- Simple), which can be used across different machine translation systems, thus increasing the interoperability of language resources. UTX-S is simpler than existing specifications such as UPF and OLIF. It was explicitly designed to make it easy to (a) add new user dictionaries and (b) share existing user dictionaries. This facilitates rapid user dictionary production and avoids vendor tie-in. In this study we describe the UTX-Simple (UTX-S) format and show that it can be converted to the user dictionary formats of five commercial English-Japanese MT systems. We then present a case study in which we (a) convert an online glossary to UTX-S, (b) produce user dictionaries for five different systems, and (c) exchange those dictionaries between the systems. The results show that the simplified format of UTX-S can be used to build dictionaries rapidly. Further, we confirm that customized user dictionaries are effective across systems, although with a slight loss in quality: on average, user dictionaries improved 44.8% of translations on the systems they were built for and 37.3% of translations on other systems. In ongoing work, AAMT is using UTX-S as the format for building up a user community that produces, shares, and accumulates user dictionaries in a sustainable way.
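The abstract does not spell out the file layout, so the following is only an illustrative sketch of the conversion idea: one simple shared dictionary is parsed once and then written out in several system-specific formats. It assumes a tab-separated file whose comment lines start with `#` and whose rows hold a source term, a target term, and a part of speech. That layout, the sample entries, and both output formats (`to_system_a`, `to_system_b`) are hypothetical and are not taken from the UTX-S specification or from any of the five MT systems.

```python
import csv
import io

# Hypothetical sample: the column layout is an assumption for illustration,
# not taken from the UTX-S specification.
SAMPLE = (
    "#hypothetical header: en/ja\n"
    "#src\ttgt\tpos\n"
    "machine translation\t機械翻訳\tnoun\n"
    "user dictionary\tユーザー辞書\tnoun\n"
)


def parse_entries(text):
    """Return (source, target, pos) tuples, skipping '#' comment lines."""
    lines = [ln for ln in text.splitlines() if ln and not ln.startswith("#")]
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [tuple(row[:3]) for row in reader if len(row) >= 3]


def to_system_a(entries):
    """Made-up vendor format A: comma-separated 'source,target,pos'."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerows(entries)
    return out.getvalue()


def to_system_b(entries):
    """Made-up vendor format B: one 'source = target [pos]' line per entry."""
    return "".join(f"{s} = {t} [{p}]\n" for s, t, p in entries)


entries = parse_entries(SAMPLE)
print(to_system_a(entries), end="")
print(to_system_b(entries), end="")
```

Keeping the shared format down to a few columns is what makes each per-system writer this short; a richer interchange format would need a correspondingly richer mapping for every target system.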