Algorithm Implementation of Japanese Machine Translation System Based on Similarity of Semantic Distribution
Authors: Li Shan, Itai Misa
Published in: 2023 Asia-Europe Conference on Electronics, Data Processing and Informatics (ACEDPI)
Publication date: 2023-04-01
DOI: 10.1109/ACEDPI58926.2023.00006
URL: https://doi.org/10.1109/ACEDPI58926.2023.00006
Citations: 0
Abstract
Semantics-based word similarity calculation is a fundamental problem in intelligent information processing, with wide use in information retrieval, machine translation, automatic question answering, text mining, and other fields. As a crystallization of human knowledge, terminology carries the core knowledge of a specific field. Many algorithms exist for calculating word similarity, but most of them do not analyze the various relationships between words, so the similarity values they produce are not accurate enough. Terminology translation, one of the active research topics in terminology, is widely used in machine translation, cross-language information retrieval, and bilingual dictionary compilation. This paper analyzes the characteristics of Japanese and Chinese terms and studies in detail three sources of information: the International Patent Classification (IPC) number of the patent documents in which the terms occur, the Chinese characters contained in the terms, and the collocations between words within the terms. This information is then used as features and fused with the language model, translation probability, and other features, and a Japanese-Chinese automatic term translation system based on multi-feature fusion is implemented using the Moses decoder.
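The feature fusion described in the abstract can be illustrated with a log-linear scoring sketch of the kind used in phrase-based decoders such as Moses. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the weights, the candidate probabilities, and the character-overlap definition are all hypothetical, and the overlap feature only counts identical code points, whereas a practical system would probably need a kanji-to-hanzi variant mapping.

```python
import math

def shared_char_ratio(ja_term: str, zh_term: str) -> float:
    """Fraction of distinct characters in the Japanese term that also
    appear in the Chinese candidate (identical code points only)."""
    ja_chars = set(ja_term)
    if not ja_chars:
        return 0.0
    return len(ja_chars & set(zh_term)) / len(ja_chars)

def loglinear_score(features: dict, weights: dict) -> float:
    """Weighted sum of feature values, i.e. a log-linear model score."""
    return sum(weights[name] * value for name, value in features.items())

def best_candidate(ja_term: str, candidates: list, weights: dict) -> str:
    """Return the Chinese candidate with the highest fused score.

    Each candidate is a dict with hypothetical keys:
    'zh' (candidate string), 'tm_prob' (translation probability),
    'lm_prob' (language model probability).
    """
    def score(cand: dict) -> float:
        features = {
            "tm": math.log(cand["tm_prob"]),
            "lm": math.log(cand["lm_prob"]),
            "char": shared_char_ratio(ja_term, cand["zh"]),
        }
        return loglinear_score(features, weights)
    return max(candidates, key=score)["zh"]

# Toy example: the probabilities and weights are made up for illustration.
ja_term = "半導体装置"
candidates = [
    {"zh": "半导体装置", "tm_prob": 0.3, "lm_prob": 0.02},
    {"zh": "半导体设备", "tm_prob": 0.4, "lm_prob": 0.02},
]
weights_with_char = {"tm": 1.0, "lm": 0.5, "char": 2.0}
weights_without_char = {"tm": 1.0, "lm": 0.5, "char": 0.0}

choice_with_char = best_candidate(ja_term, candidates, weights_with_char)
choice_without_char = best_candidate(ja_term, candidates, weights_without_char)
```

In this toy setup the second candidate has the higher translation probability, so it wins when the character feature has zero weight, while the first candidate wins once the character-overlap feature is weighted in. In an actual Moses setup the feature weights would be tuned on held-out data rather than set by hand.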