Anusaaraka: An expert system based machine translation system

Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering (NLPKE-2010) Pub Date: 2010-09-30 DOI: 10.1109/NLPKE.2010.5587789

Sriram Chaudhury, A. Rao, D. Sharma


Citations: 30

Abstract

Most research in machine translation aims to have computers bear the entire load of translating one human language into another. This paper looks at the machine translation problem afresh and observes that there is a need to share the load between man and machine, to distinguish reliable knowledge from heuristics, to provide a spectrum of outputs that serves different strata of users, and to make use of existing resources instead of reinventing the wheel. The paper describes an approach to developing a machine translation system based on insights into information dynamics from the Paninian Grammar Formalism. Anusaaraka is a Language Accessor cum Machine Translation system built on the fundamental premise of sharing the load and producing good-enough results according to the needs of the reader. The system promises a faithful representation of the translated text, no loss of information during translation, and graceful degradation (robustness) in case of failure. The layered output gives access to every stage of translation, making the whole process transparent. Anusaaraka thus differs from other machine translation systems in two respects: (1) it is committed to faithfulness and therefore provides a layer of 100% faithful output, so that a user with some training can "access the source text" faithfully; and (2) it is designed so that users can contribute to it and participate in improving its quality. Further, Anusaaraka provides an eclectic combination of the Apertium architecture with a forward-chaining expert system, allowing both deep-parser and shallow-parser outputs to be used to analyze the source-language (SL) text. Existing language resources (parsers, taggers, chunkers) available under the GPL are reused rather than rewritten. Language data and linguistic rules are independent of the core program, making it easy for linguists to modify the system and to experiment with different language phenomena in order to improve it.
Users can become contributors by submitting new word sense disambiguation (WSD) rules for ambiguous words through a web interface available over the internet. The system uses the forward chaining of an expert system to infer new language facts from the existing language data. This helps to handle the complex behavior of language translation by applying specific knowledge rather than a specific technique, building up a vast language knowledge base in electronic form. In other words, the expert system facilitates the transformation of the knowledge held by human subject matter experts (SMEs) into a computer-processable knowledge base.
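The forward-chaining step described in the abstract can be illustrated with a minimal sketch. It is written in plain Python and is not the system's own rule engine or rule format. The fact tuples, the rule names, and the `bank` example are all hypothetical, chosen only to show how a contributed WSD rule could add a new fact that a second rule then builds on.

```python
from typing import FrozenSet, List, NamedTuple, Set, Tuple

Fact = Tuple[str, ...]


class Rule(NamedTuple):
    name: str
    conditions: FrozenSet[Fact]  # every condition must already be known
    conclusion: Fact             # fact asserted when the rule fires


def forward_chain(facts: Set[Fact], rules: List[Rule]) -> Set[Fact]:
    """Apply the rules repeatedly until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            if rule.conditions <= known and rule.conclusion not in known:
                known.add(rule.conclusion)
                changed = True
    return known


# Hypothetical rules: the second fires only after the first has added its fact.
RULES = [
    Rule("river-edge-is-noun",
         frozenset({("sense", "bank", "river_edge")}),
         ("category", "bank", "noun")),
    Rule("bank-near-river",
         frozenset({("word", "bank"), ("context", "river")}),
         ("sense", "bank", "river_edge")),
]

initial = {("word", "bank"), ("context", "river")}
derived = forward_chain(initial, RULES)
new_facts = derived - initial  # facts inferred by the rules
```

Keeping the rules as data, separate from `forward_chain`, mirrors the abstract's point that linguistic rules are independent of the core program: adding a rule to `RULES` changes the behavior without touching the inference loop.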
