Meaning-based content word alignment heuristic
É. Poirier
Proceedings of the 8th International Conference on Management of Digital EcoSystems, 2016-11-01
DOI: 10.1145/3012071.3012074 (https://doi.org/10.1145/3012071.3012074)

This paper presents a meaning-based heuristic for aligning content words within parallel translation segments. The heuristic provides a manual method for aligning content words in expressions or segments, together with the principles on which the content-word pairs are aligned. We describe important alignment criteria: meaning-based similarity (direct or indirect), functional meaning, and manual idiom chunking. The null token concept is also used to represent language-specific constraints. The purpose of meaning-based alignment of content words is to represent how information content is distributed between source and target segments. This representation may contribute to evaluation metrics for translation segments by identifying even and uneven distributions of information and, for uneven distributions, lexicogrammatical shifts in translation or other phenomena in the transfer of information content. The heuristic is illustrated with parallel English-French segments taken from a bilingual alignment of United States President John F. Kennedy's 1961 inaugural address and its French translation, published officially by the John F. Kennedy Presidential Library and Museum.
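
As a rough illustration of how manually aligned content-word pairs and null tokens might be recorded, the following Python sketch stores each pair and labels a segment "even" or "uneven". Everything in it is an assumption made for illustration: the names `Pair`, `NULL`, and `distribution`, the use of `None` as the null token, and the rule that a segment counts as uneven when any content word is paired with the null token. The abstract does not give the paper's actual definitions, and the example word pairs are invented, not taken from the Kennedy corpus.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Assumption: the null token is represented as None.
NULL = None


@dataclass(frozen=True)
class Pair:
    """One manually aligned content-word pair (hypothetical structure)."""
    source: Optional[str]  # source-language content word, or NULL
    target: Optional[str]  # target-language content word, or NULL

    def is_null_aligned(self) -> bool:
        # True when one side of the pair is the null token.
        return self.source is NULL or self.target is NULL


def distribution(pairs: Sequence[Pair]) -> str:
    """Label a segment 'even' or 'uneven'.

    Assumed rule (not from the paper): a segment is uneven as soon as
    any content word is aligned with the null token.
    """
    return "uneven" if any(p.is_null_aligned() for p in pairs) else "even"


# Invented example pairs, for illustration only.
even_segment = [Pair("freedom", "liberté"), Pair("nation", "nation")]
uneven_segment = [Pair("freedom", "liberté"), Pair("survival", NULL)]

print(distribution(even_segment))
print(distribution(uneven_segment))
```

The alignment criteria themselves (direct or indirect similarity, functional meaning, idiom chunking) are applied by a human annotator in the heuristic described above, so the sketch only records the outcome of those decisions and does not try to automate them.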